Preamble



The graph isomorphism problem has a long history in mathematics and computer
science, and more recently in chemistry and biology. Graph theory is a branch of
mathematics started by Euler as early as 1736 with his paper on the seven
bridges of Königsberg. It took a hundred years before the next important
contribution, due to Kirchhoff, was made for the analysis of electrical networks.
Cayley and Sylvester discovered several properties of special types of graphs
known as trees. Poincaré defined what is known nowadays as the incidence matrix
of a graph. It took another century before the first book was published, by
Dénes Kőnig in 1936, titled Theorie der endlichen und unendlichen Graphen. After
the Second World War, further books appeared on graph theory, for example the
books of Ore, Behzad and Chartrand, Tutte, Berge, Harary, Gould, and West, among
many others.

The graph isomorphism problem is the computational problem of determining
whether two finite graphs are isomorphic. Besides its practical importance, the
graph isomorphism problem is one of the few problems in NP that is neither known
to be solvable in polynomial time nor known to be NP-complete. It is one of only
12 such problems listed by Garey & Johnson (1979), and one of only two of that
list whose complexity remains unresolved (the other being integer factorization).
It is known that this problem is in the low hierarchy of the class NP, which
implies that it is not NP-complete unless the polynomial-time hierarchy collapses
to its second level. Since the graph isomorphism problem is neither known to
be NP-complete nor known to be tractable, researchers have sought to gain insight
into the problem by defining a new class GI, the set of problems with a
polynomial-time Turing reduction to the graph isomorphism problem [5]. In fact,
if the graph isomorphism problem is solvable in polynomial time, then GI would
equal P.

The best current theoretical algorithm is due to Eugene Luks (1983) and is
based on the earlier work by Luks (1981) and by Babai and Luks (1982), combined
with a subfactorial algorithm due to Zemlyachenko (1982). The algorithm relies on
the classification of finite simple groups. Without these results, a slightly
weaker bound $2^{O(\sqrt{n}\,\log^{2} n)}$ was obtained first for strongly regular
graphs by László Babai (1980), and then extended to general graphs by Babai and
Luks (1982), where $n$ is the number of vertices. Improvement of the exponent
$\sqrt{n}$ is a major open problem; for strongly regular graphs this was done by
Spielman (1996).

There are several practical applications of the graph isomorphism problem. In
cheminformatics and in mathematical chemistry, graph isomorphism testing is used
to identify a chemical compound within a chemical database. In organic
mathematical chemistry, graph isomorphism testing is also useful for the
generation of molecular graphs and for computer synthesis. Chemical database
search is an example of graphical data mining, where the graph canonization
approach is often used. In particular, a number of identifiers for chemical
substances, such as SMILES and InChI, designed to provide a standard and
human-readable way to encode molecular information and to facilitate the search
for such information in databases and on the web, use a canonization step in
their computation, which is essentially the canonization of the graph that
represents the molecule. In electronic design automation, graph isomorphism is
the basis of the Layout Versus Schematic (LVS) circuit design step, which
verifies whether the electric circuits represented by a circuit schematic and an
integrated circuit layout are the same. Another application is evolutionary
graph theory, an area of research lying at the intersection of graph theory,
probability theory, and mathematical biology. Evolutionary graph theory studies
how topology affects the evolution of a population. That the underlying topology
can substantially affect the results of the evolutionary process is seen most
clearly in a paper by Erez Lieberman, Christoph Hauert and Martin Nowak.

It is therefore important to design polynomial-time algorithms that test whether
two graphs are isomorphic, at least for some special classes of graphs. An
approach to this was presented by Eugene M. Luks (1981) in the work Isomorphism
of Graphs of Bounded Valence Can Be Tested in Polynomial Time. Unfortunately, it
is a theoretical algorithm that is very difficult to put into practice, and there
is no known implementation of it, although Galil, Hoffman and Luks (1983) give
an improvement of this algorithm running in $O(n^{3} \log n)$.

The two main goals of this master thesis are to explain the algorithm of Luks
(1981) more carefully, including a detailed study of its complexity, and then to
provide an efficient implementation in the SAGE system. The thesis is divided
into four chapters plus an appendix.

Chapter 1 presents the preliminaries needed to follow the rest of the
dissertation. This chapter contains three sections: the first introduces the
required topics from group theory, in particular the symmetric group; the second
introduces the main definitions and results of graph theory; and the last
presents the required concepts from complexity theory.

Chapter 2 collects some basic algorithms in group and graph theory for later
use.

Chapter 3 is the main part; it carefully explains the trivalent case and the
complexity of the algorithm. Its last section extends the algorithm to the
general case.

Finally, Chapter 4 deals with testing the implementation.

Appendix A is dedicated to the documentation of the implementation in the SAGE
system.



Preliminaries 



This chapter gives a gentle yet concise introduction to most of the terminology
used later in this master thesis.

1.1 Group theory background 

We will focus on the part of group theory concerning the symmetric group; for
further background we refer the reader to [13].

The symmetric group of a finite set $A$ is the group whose elements are all the
bijective maps from $A$ to $A$ and whose group operation is the composition of
such maps. For finite sets, "permutation" and "bijective map" refer to the same
operation, namely a rearrangement of the elements.

The symmetric group of degree $n$ is the symmetric group on a set $A$ such that
$|A| = n$. We will denote this group by $S_n$ or, if the set $A$ requires
explanation, by $\mathrm{Sym}(A)$.

Since a cycle $(i_1 \dots i_r)$ can be written as a product of transpositions,
$S_n$ is generated by its subset of transpositions. But, except for the case
$n = 2$, we don't need every transposition in order to generate the symmetric
group, since for $1 \le j < k < n$ we have

$(j \; k+1) = (k \; k+1)(j \; k)(k \; k+1)$

Thus the transposition $(j \; k+1)$ can be obtained from $(j \; k)$ and
$(k \; k+1)$. Therefore the subset

$S = \{(i \; i+1) \mid 1 \le i < n\}$

consisting of the elementary transpositions, generates $S_n$. A further system
of generators of $S_n$ is obtained from the expression

$(1 \dots n)^{i} (1 \; 2) (1 \dots n)^{-i} = (i+1 \; i+2), \quad 1 \le i \le n-2$

so that we have proved that the symmetric group $S_n$ is generated by the
permutations $(1 \; 2)$ and $(1 \dots n)$.
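The claim above can be checked by brute force for a small degree. The following is a minimal sketch, not part of the thesis implementation: permutations are stored as 0-based tuples, and the helper names `compose` and `generated_group` are assumptions introduced only for this illustration.

```python
from math import factorial


def compose(p, q):
    """Return the permutation 'apply q first, then p' (tuples on 0..n-1)."""
    return tuple(p[q[k]] for k in range(len(p)))


def generated_group(generators):
    """Return all elements of the group generated by the given permutations."""
    identity = tuple(range(len(generators[0])))
    elements = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)  # in a finite group, left products by generators suffice
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements


n = 5
transposition = (1, 0) + tuple(range(2, n))        # the transposition (1 2), 0-based
long_cycle = tuple((k + 1) % n for k in range(n))  # the cycle (1 2 ... n), 0-based
group = generated_group([transposition, long_cycle])
print(len(group) == factorial(n))
```

The enumeration is exponential in $n$ and is only meant to make the statement concrete for small cases.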

A permutation group is a finite group $G$ whose elements are permutations of a
given set and whose group operation is composition of permutations in $G$, i.e., a
permutation group is a subgroup of the symmetric group on the given set.

We will say that a subset $T$ of $\mathrm{Sym}(A)$ stabilizes a subset $B$ of $A$
if $\sigma(B) = B$ for all $\sigma \in T$. If $G$ is a group and $G$ stabilizes a
subset $B$, we will say that $G$ acts on $B$, i.e., we have a homomorphism from
$G$ to $\mathrm{Sym}(B)$. An action of $G$ on $B$ is called faithful if this
homomorphism is injective.

Definition 1. If $G$ acts on $B$ and $b \in B$, the $G$-orbit of $b$ is the set
$G_b = \{\sigma(b) \mid \sigma \in G\}$.

We say that a group $G$ acts transitively on $B$ if $B = G_b$ for some $b \in B$.
Note that if $B = G_b$ for some $b \in B$, then $B = G_b$ for all $b \in B$.
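An orbit can be computed from a generating set alone, without listing the whole group. The sketch below assumes permutations are given as Python dictionaries mapping each point to its image; the function names `orbit` and `is_transitive` are hypothetical and chosen only for this example.

```python
def orbit(generators, point):
    """Return the orbit of `point` under the group generated by `generators`."""
    seen = {point}
    frontier = [point]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = g[x]
            if y not in seen:
                seen.add(y)
                frontier.append(y)
    return seen


def is_transitive(generators, points):
    """A group acts transitively when a single orbit covers all points."""
    points = set(points)
    return orbit(generators, next(iter(points))) == points


# Generators (1 3)(2 4) and (1 4)(2 3) on the set {1, 2, 3, 4}.
a = {1: 3, 3: 1, 2: 4, 4: 2}
b = {1: 4, 4: 1, 2: 3, 3: 2}
print(orbit([a, b], 1), is_transitive([a, b], [1, 2, 3, 4]))
```

Because each generator is applied to each discovered point once, the running time is proportional to the number of points times the number of generators.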

Definition 2. A $G$-block is a subset $B$ of $A$, $B \neq \emptyset$, such that,
for all $\sigma \in G$, $\sigma(B) = B$ or $\sigma(B) \cap B = \emptyset$.

In particular, the set $A$ and all 1-element subsets of $A$ are blocks; these are
called the trivial blocks. An example of non-trivial blocks, for a group that
does not act transitively on $A$, are the $G$-orbits¹.

If $B$ is a $G$-block, then a $G$-block system is the collection
$\{\sigma(B) \mid \sigma \in G\}$.

Example 1. Let $n = 4$ and $G = \{\mathrm{id}, (1\,3)(2\,4), (1\,4)(2\,3), (1\,2)(3\,4)\}$.
Then the set $\{1,3\}$ is a $G$-block and the collection $\{\{1,3\}, \{2,4\}\}$ is
a $G$-block system.
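Example 1 can be verified mechanically. The following sketch assumes the group is given as the full list of its elements (checking generators alone would not be enough for Definition 2), with each permutation stored as a dictionary; `is_block` and `block_system` are illustrative names, not functions of the thesis code.

```python
def is_block(group, subset):
    """Check Definition 2 against every element of `group` (a full element list)."""
    subset = frozenset(subset)
    if not subset:
        return False
    for sigma in group:
        image = frozenset(sigma[x] for x in subset)
        if image != subset and image & subset:
            return False
    return True


def block_system(group, block):
    """Return the collection of images of `block` under all group elements."""
    return {frozenset(sigma[x] for x in block) for sigma in group}


# The four elements of the group G of Example 1.
klein = [
    {1: 1, 2: 2, 3: 3, 4: 4},
    {1: 3, 3: 1, 2: 4, 4: 2},
    {1: 4, 4: 1, 2: 3, 3: 2},
    {1: 2, 2: 1, 3: 4, 4: 3},
]
print(is_block(klein, {1, 3}), block_system(klein, {1, 3}))
```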

The action of $G$ is said to be primitive if the only $G$-blocks are the trivial
blocks. Since the $G$-orbits are $G$-blocks, if $G \neq \mathrm{Id}$ acts
primitively on $A$ then $G$ acts transitively. In the case that $G$ acts
transitively, the $G$-blocks are called blocks of imprimitivity.

A $G$-block system is said to be minimal if $G$ acts primitively on the blocks.
In the previous example the $G$-block system is minimal. Note that the number
of blocks in a minimal $G$-block system is not, in general, uniquely determined.
However, we have the next result.

Lemma 1. Let $P$ be a transitive $p$-subgroup of $\mathrm{Sym}(A)$ with $|A| > 1$.
Then there exists a $P$-block system consisting of exactly $p$ blocks.
Furthermore, the subgroup $P'$ which stabilizes all of the blocks has index $p$.

Proof. Take a minimal $P$-block system. The quotient $P/P'$ is a primitive
$p$-group (acting on the blocks), and so the order of $P/P'$ = number of blocks
= $p$ [8, p. 66]. □

Thanks to the above lemma, if $P$ is a transitive 2-subgroup of
$\mathrm{Sym}(A)$, then there exist $B_1, B_2$ such that $A = B_1 \cup B_2$, where
$B_1$ and $B_2$ are $P$-blocks.

1.2 Graph theory background 

Fortunately, much of standard graph theoretic terminology is so intuitive that it is 
easy to remember. 

¹ $\sigma(G_b) = G_b$ for all $\sigma \in G$.

A graph is a pair $G = (V, E)$ of sets such that $E \subseteq [V]^2$; thus, the
elements of $E$ are 2-element subsets of $V$. The elements of $V$ are the vertices
of the graph $G$, and the elements of $E$ are its edges.

Note 1. If we consider the edges as 2-tuples (ordered pairs), we have a digraph.
In the example below we can see the differences between a graph and a digraph.

Example 2. Take $V = \{1,2,3,4\}$ and $E = \{(1,2), (1,3), (1,4)\}$. Then the
graph $G$ is the graph shown in Figure 1.1, and the digraph is the one shown in
Figure 1.2.




Figure 1.1: Graph with V = {1,2,3,4} and E = {(1, 2), (1, 3), (1, 4)}. 

Note 2. Note that the graph $G = (\{1,2,3,4\}, \{(1,2), (1,3), (1,4)\})$ is the
same graph as $G' = (\{1,2,3,4\}, \{(1,2), (3,1), (1,4)\})$, but if we consider
$G$ and $G'$ as digraphs they are not the same digraph.

The vertex set of a graph $G$ is referred to as $V(G)$ and the edge set as $E(G)$.
These conventions are independent of any actual name of these two sets; for
example, if we define a graph $H = (W, F)$ the vertex set of the graph is still
referred to as $V(H)$, not as $W(H)$. If there is no possible confusion we don't
distinguish between the graph and its vertex set or edge set; for example, we
speak of a vertex $v \in G$ and an edge $e \in G$.

Definition 3. If $G$ is a graph, then two vertices $e_1, e_2 \in V(G)$ are
neighbors if $(e_1, e_2) \in E(G)$. If we have a digraph, we say that $e_2$ is a
successor of $e_1$ or, equivalently, that $e_1$ is a predecessor of $e_2$, if
$(e_1, e_2) \in E(G)$.

Figure 1.2: Digraph with $V = \{1,2,3,4\}$ and $E = \{(1,2), (3,1), (1,4)\}$.

Another well known concept is the following:

Definition 4. A path in a graph is a sequence of vertices such that from each of 
its vertices there is an edge to the next vertex in the sequence. A cycle is a path 
such that the start vertex and end vertex are the same. The choice of the start 
vertex in a cycle is arbitrary. 

A special family of graphs is the following:

Definition 5. In a graph G, two vertices u and v are called connected if G contains 
a path from u to v. A graph is said to be connected if every pair of vertices in the 
graph is connected. A directed graph is called weakly connected if replacing all of 
its directed edges with undirected edges produces a connected graph. 

We also need the following two concepts: 

Definition 6. In an undirected graph $G$, the degree of a node $v \in V(G)$ is the
number of edges that connect to it. In a directed graph, the in-degree of a node
is the number of edges arriving at that node, and the out-degree is the number of
edges leaving that node.

Definition 7. We define the valence of an undirected graph $G$ as
$\max_{v \in V(G)} \deg(v)$.
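Definitions 6 and 7 translate directly into a few lines of code. This is a small sketch under the assumption that an undirected graph is given as a vertex collection and a list of vertex pairs; `degrees` and `valence` are names made up for the illustration.

```python
from collections import Counter


def degrees(vertices, edges):
    """Return a mapping vertex -> degree for an undirected graph."""
    count = Counter({v: 0 for v in vertices})
    for u, v in edges:
        count[u] += 1
        count[v] += 1
    return count


def valence(vertices, edges):
    """Return the maximum degree over all vertices (Definition 7)."""
    return max(degrees(vertices, edges).values())


# The graph of Example 2: a star with center 1.
star_vertices = [1, 2, 3, 4]
star_edges = [(1, 2), (1, 3), (1, 4)]
print(valence(star_vertices, star_edges))
```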

Using the above definitions, we can state the following well known result: 

Proposition 1. Let $X$ be a connected graph with valence $t$. Then

$|E(X)| \le |V(X)| \cdot t$

Proof. Every node $v \in V(X)$ is connected with at most $t$ nodes, so for each
node there are at most $t$ edges connected to $v$. □

The following is a natural definition:

Definition 8. Let $G = (V, E)$ and $G' = (V', E')$ be two graphs or digraphs. We
say that $G$ and $G'$ are isomorphic if there exists a bijection
$\varphi : V \to V'$ such that
$(x, y) \in E \iff (\varphi(x), \varphi(y)) \in E'$ for all $x, y \in V$.
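Definition 8 can be tested literally by trying every bijection between the vertex sets. The sketch below does exactly that for undirected graphs; it takes factorial time and is meant only to make the definition concrete, not to stand in for the algorithms studied later. The function name `find_isomorphism` is an assumption of this example.

```python
from itertools import permutations


def find_isomorphism(v1, e1, v2, e2):
    """Return a bijection V1 -> V2 mapping E1 onto E2, or None (undirected graphs)."""
    v1, v2 = list(v1), list(v2)
    if len(v1) != len(v2):
        return None
    target = {frozenset(edge) for edge in e2}
    for image in permutations(v2):
        phi = dict(zip(v1, image))
        # phi is injective, so equality of the edge sets gives the "iff" of Definition 8
        if {frozenset((phi[x], phi[y])) for x, y in e1} == target:
            return phi
    return None


star = ([1, 2, 3, 4], [(1, 2), (1, 3), (1, 4)])
star_relisted = ([1, 2, 3, 4], [(1, 2), (3, 1), (1, 4)])
path = ([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)])
print(find_isomorphism(*star, *star_relisted) is not None)
print(find_isomorphism(*star, *path) is None)
```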

The previous map $\varphi$ is called an isomorphism; if $G = G'$, it is called an
automorphism.

Proposition 2. Let $G = (V, E)$ be a graph. The set of automorphisms,
$\mathrm{Aut}(G)$, forms a permutation group.

Proof. We only need to see that $\mathrm{Aut}(G)$ is a subgroup of
$\mathrm{Sym}(V)$.

• If $\varphi \in \mathrm{Aut}(G)$ then $\varphi^{-1}$ is an automorphism, because
it is bijective and, since

$(x, y) \in E \iff (\varphi(x), \varphi(y)) \in E$

applying $\varphi^{-1}$ to both pairs gives

$(\varphi^{-1}(x), \varphi^{-1}(y)) \in E \iff (x, y) \in E$

• If $\varphi, \varphi' \in \mathrm{Aut}(G)$ then

$(x, y) \in E \iff (\varphi(x), \varphi(y)) \in E \iff (\varphi'(\varphi(x)), \varphi'(\varphi(y))) \in E$

Thus $\mathrm{Aut}(G)$ is closed under inverses and products, so $\mathrm{Aut}(G)$
is a subgroup of $\mathrm{Sym}(V)$ and therefore $\mathrm{Aut}(G)$ is a permutation
group. □

The above result suggests the following notation.

Definition 9. We denote by $\mathrm{Aut}_e(G)$ the subgroup of $\mathrm{Aut}(G)$
that fixes the edge $e$, i.e., for all $\varphi \in \mathrm{Aut}_e(G)$, if
$e = (v_1, v_2)$ then either $\varphi(v_1) = v_2$ and $\varphi(v_2) = v_1$, or
$\varphi(v_1) = v_1$ and $\varphi(v_2) = v_2$.

The following example illustrates the above concepts:

Example 3. Let $G$ be the graph of Example 2. Then
$\mathrm{Aut}(G) = \langle (2\,3), (2\,4), (3\,4) \rangle$ and, if $e = (1,2)$,
$\mathrm{Aut}_e(G) = \langle (3\,4) \rangle$. If we consider the digraph, then
$\mathrm{Aut}(G) = \langle (2\,4) \rangle$ and $\mathrm{Aut}_e(G) = \mathrm{Id}$.
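The groups in Example 3 are small enough to enumerate. The following sketch lists all automorphisms by brute force and then filters those fixing an edge in the sense of Definition 9; the digraph used is the one of Figure 1.2. The helper names `automorphisms` and `fixes_edge` are hypothetical.

```python
from itertools import permutations


def automorphisms(vertices, edges, directed=False):
    """Return every automorphism (as a dict) of a small graph or digraph."""
    vertices = sorted(vertices)
    norm = (lambda u, v: (u, v)) if directed else (lambda u, v: frozenset((u, v)))
    edge_set = {norm(u, v) for u, v in edges}
    result = []
    for image in permutations(vertices):
        phi = dict(zip(vertices, image))
        if {norm(phi[u], phi[v]) for u, v in edges} == edge_set:
            result.append(phi)
    return result


def fixes_edge(phi, edge):
    """Definition 9: the endpoints of the edge are fixed or interchanged."""
    u, v = edge
    return {phi[u], phi[v]} == {u, v}


V = [1, 2, 3, 4]
graph_aut = automorphisms(V, [(1, 2), (1, 3), (1, 4)])
digraph_aut = automorphisms(V, [(1, 2), (3, 1), (1, 4)], directed=True)
e = (1, 2)
print(len(graph_aut), len([p for p in graph_aut if fixes_edge(p, e)]))
print(len(digraph_aut), len([p for p in digraph_aut if fixes_edge(p, e)]))
```

The counts agree with the orders of the groups named in Example 3: the symmetric group on $\{2,3,4\}$ has 6 elements, $\langle (3\,4) \rangle$ and $\langle (2\,4) \rangle$ have 2 each, and the trivial group has 1.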

Definition 10. A tree is a finite, connected, acyclic graph. We say that a tree
is rooted if it has a distinguished node, called the root. In a rooted tree, the
parent of a node $x$ is the unique node adjacent to $x$ which is closer to the
root; the children of a node $x$ are the nodes of which $x$ is the parent; a node
$x$ is an ancestor of a node $y$ if the shortest path from $y$ to the root
contains $x$, and in this case we also say that $y$ is a descendant of $x$.

In a tree $T$ we have two types of vertices: the leaves $L(T)$, or terminal
nodes, which belong to a single edge (in a rooted tree a leaf is a node without
children); and the interior nodes $\mathrm{Int}(T)$.

Definition 11. A phylogenetic tree is a triplet $(T, \rho, \{u_1, \dots, u_n\})$
where $T$ is a tree with $n$ leaves, $\{u_1, \dots, u_n\}$ is a set of different
species (or taxa), and $\rho : \{u_1, \dots, u_n\} \to L(T)$ is a bijection.

In the literature the leaves represent current species and the interior nodes 
represent ancestral species. The tree records the ancestral relationships among the 
current species. 

Definition 12. By an evolutionary network on a set $S$ of taxa we simply mean a
rooted directed acyclic graph, with its leaves bijectively labeled in $S$.

A tree node of an evolutionary network $N = (V, E)$ is a node of in-degree at
most 1, and a hybrid node is a node of in-degree at least 2. A tree arc
(hybridization arc) is a path such that the start vertex is a tree node (hybrid
node). As in trees, a node $v \in V$ is a child of $u \in V$ if $(u, v) \in E$; we
also say in this case that $u$ is a parent of $v$. Note that in a network a node
can have more than one parent.

Definition 13. An evolutionary network is binary when its hybrid nodes have 
in-degree 2, out-degree 1 and internal tree nodes have out-degree 2. 

An isomorphism between two rooted trees $T_1$ and $T_2$ is an isomorphism from
$T_1$ to $T_2$ as graphs that sends the root of $T_1$ to the root of $T_2$. An
isomorphism between phylogenetic trees or evolutionary networks also preserves
the bijection $\rho$, i.e., if $\varphi : V(T_1) \to V(T_2)$ is an isomorphism
between $(T_1, \rho_1, \{u_1, \dots, u_n\})$ and $(T_2, \rho_2, \{u_1, \dots, u_n\})$,
then $\varphi(\rho_1(u_i)) = \rho_2(u_i)$ for all $i = 1, \dots, n$. If $T_1$ and
$T_2$ have roots $r_1, r_2$ respectively, we also require that
$\varphi(r_1) = r_2$.

1.3 Computational complexity theory background 

Computational complexity theory is a branch of the theory of computation in
theoretical computer science and mathematics that focuses on classifying
computational problems according to their inherent difficulty, and relating those
classes to each other. Many important complexity classes can be defined by
bounding the time or space used by the algorithm. Some important complexity
classes of decision problems defined by bounding time are the following:



Complexity class   Model of computation               Resource constraint

DTIME(f(n))        Deterministic Turing machine       Time f(n)
P                  Deterministic Turing machine       Time poly(n)
EXPTIME            Deterministic Turing machine       Time 2^poly(n)
NTIME(f(n))        Non-deterministic Turing machine   Time f(n)
NP                 Non-deterministic Turing machine   Time poly(n)
NEXPTIME           Non-deterministic Turing machine   Time 2^poly(n)

We will focus on the class P, also known as PTIME. PTIME is one of the most
fundamental complexity classes; it contains all decision problems which can be
solved by a deterministic Turing machine using a polynomial amount of
computation time, or polynomial time. Cobham's thesis holds that P is the class
of computational problems which are "efficiently solvable" or "tractable"; in
practice, some problems not known to be in P have practical solutions, and some
that are in P do not, but this is a useful rule of thumb.

A more formal definition of P is

Definition 14. A language $L$ is in P if and only if there exists a deterministic
Turing machine $M$, such that

• $M$ runs for polynomial time on all inputs

• For all $x \in L$, $M$ outputs 1

• For all $x \notin L$, $M$ outputs 0

1.3.1 Reducibility 

Intuitively, a problem $Q$ can be reduced to another problem $Q'$ if any instance
of $Q$ can be "easily rephrased" as an instance of $Q'$, the solution to which
provides a solution to the instance of $Q$. For example, the problem of solving
linear equations in an indeterminate $x$ reduces to the problem of solving
quadratic equations. Given an instance $ax + b = 0$, we transform it to
$0x^2 + ax + b = 0$, whose solution provides a solution to $ax + b = 0$. Thus, if
a problem $Q$ reduces to another problem $Q'$, then $Q$ is, in a sense, "no harder
to solve" than $Q'$.

Definition 15. If there exists a polynomial-time algorithm $F$ that computes this
"rephrasing", then we say that $Q$ is polynomial-time reducible to $Q'$.

Then, if we can solve the problem $Q'$ in polynomial time, we can also solve the
problem $Q$ in polynomial time. This technique is very useful because, in
general, it is easy to find an easier problem to which our initial problem is
polynomial-time reducible.

Example 4. Solving linear equations in an indeterminate $x$ clearly reduces in
polynomial time to the problem of solving quadratic equations.

Basic Algorithms

In this chapter we introduce some basic algorithms. The first section contains
algorithms in group theory that we will use in the main algorithm. In the second
section we present an algorithm to test whether two phylogenetic trees are
isomorphic; with this example we will see that sometimes the isomorphism problem
is easy.

2.1 Algorithms in group theory

Since every subgroup of $S_n$ can be generated by at most $n$ elements [9], any
subgroup of $S_n$ can be specified in space which is polynomial in $n$.

Lemma 2 (Furst-Hopcroft-Luks). Given a set of generators for a subgroup $G$ of
$S_n$ one can determine in polynomial time

1. the order of $G$

2. whether a given permutation $\sigma$ is in $G$

3. generators for any subgroup of $G$ which is known to have polynomially bounded
index in $G$ and for which a polynomial-time membership test is available.

Proof. Let $G$ be a subgroup of $S_n$, and denote by $G_i$ the subgroup of $G$
which fixes the numbers in $\{1, \dots, i\}$. Thus we have a chain of subgroups

$1 = G_{n-1} \subseteq \dots \subseteq G_1 \subseteq G_0 = G$

Now we construct complete sets of coset representatives, $C_i$, for $G_i$ modulo
$G_{i+1}$, $0 \le i \le n-2$; then $|G| = \prod_{i=0}^{n-2} |C_i|$. The main part
of this construction is the subroutine Algorithm 1. The input is an element
$\alpha \in G$; the lists $C_i$ contain sets of left coset representatives for
$G_i$ modulo $G_{i+1}$.

Thus the subroutine searches for a representative of the coset of $\alpha$ modulo
$G_{i+1}$ in the list $C_i$. If it is not found, then $\alpha$ represents a
previously undiscovered coset and it is added to the list. If it is found as
$\gamma$, then $\gamma^{-1}\alpha$ is in $G_{i+1}$ and its class modulo $G_{i+2}$
is sought in $C_{i+1}$. Since, for $\alpha \in G_i$, membership in $G_{i+1}$ is
testable in constant time (we only need to see if $\alpha(i+1) = i+1$), the
procedure requires only polynomial time.

The algorithm for the first part of the lemma is now easily stated:

1. Initialize $C_i \leftarrow \{1\}$ for all $i$.

2. Filter the set of generators of $G$.

3. Filter the sets $C_i C_j$ with $i \ge j$.

Algorithm 1: Filter

Data: $\alpha \in G$
Result: Add $\alpha$ to its $C_i$
begin
    for $i \in [0, n-2]$ do
        if $\exists\, \gamma \in C_i : \gamma^{-1}\alpha \in G_{i+1}$ then
            $\alpha \leftarrow \gamma^{-1}\alpha$
        else
            add $\alpha$ to $C_i$
            return
    return



Of course, the calls to the subroutine may result in an increase in some $C_i$,
thus demanding more runs of (3). However, we know a priori that, at any stage,
$|C_i| \le |G_i : G_{i+1}| \le n - i$. Thus the process terminates in polynomial
time. The result of (2) is that the original generating set is contained in
$C_0 C_1 \dots C_{n-2}$. The actual outcome of (3), given (1), is that
$C_i C_j \subseteq C_j C_{j+1} \dots C_{n-2}$. These facts can be used to prove
that $G = C_0 C_1 \dots C_{n-2}$. That $C_i$ represents $G_i$ modulo $G_{i+1}$ is
then immediate.

By the first part of the lemma, the second is an immediate consequence of the
fact: $\sigma \in \langle \Phi \rangle \iff |\langle \Phi, \sigma \rangle| = |\langle \Phi \rangle|$.
This membership test may be implemented by a construction of the lists $C_i$ for
$\langle \Phi \rangle$ followed by the call Filter($\sigma$). Then $\sigma \in G$
if and only if it doesn't force an increase in some $C_i$.

For the last part of the lemma, we alter the group chain to

$1 = H_{n-1} \subseteq \dots \subseteq H_2 \subseteq H_1 \subseteq H \subseteq G$

and apply the same algorithm to generate complete sets of coset representatives.
Note that the polynomial index of $H$ in $G$ and the requirement that membership
in $H$ be polynomially decidable guarantee again that the entire process takes
only polynomial time. Ignoring the first list, the remaining lists comprise a set
of generators for $H$. □
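The first two parts of the lemma can be sketched as follows. This is an illustration under several assumptions, not the SAGE implementation of the thesis: permutations are 0-based tuples, $G_i$ is the pointwise stabilizer of $\{0, \dots, i-1\}$, each list $C_i$ is stored as a dictionary keyed by the image of the point $i$, and step (3) is run as a plain fixed-point loop rather than with any scheduling. All function names are hypothetical.

```python
def compose(p, q):
    """Return 'apply q first, then p'."""
    return tuple(p[q[k]] for k in range(len(p)))


def inverse(p):
    inv = [0] * len(p)
    for k, image in enumerate(p):
        inv[image] = k
    return tuple(inv)


def strip(reps, alpha):
    """Peel coset representatives off alpha; return (level where it failed or None, residue)."""
    for i, table in enumerate(reps):
        gamma = table.get(alpha[i])  # representative sending i to alpha(i), if any
        if gamma is None:
            return i, alpha
        alpha = compose(inverse(gamma), alpha)
    return None, alpha


def filter_element(reps, alpha):
    """Algorithm 1: add the residue of alpha to its C_i; return True if a list grew."""
    level, residue = strip(reps, alpha)
    if level is None:
        return False
    reps[level][residue[level]] = residue
    return True


def coset_tables(generators, n):
    """Steps (1)-(3): build C_0, ..., C_{n-2} for the group generated by `generators`."""
    identity = tuple(range(n))
    reps = [{i: identity} for i in range(n - 1)]
    for g in generators:
        filter_element(reps, g)
    changed = True
    while changed:  # repeat step (3) until no list C_i grows
        changed = False
        for i in range(n - 1):
            for j in range(i + 1):
                for x in list(reps[i].values()):
                    for y in list(reps[j].values()):
                        if filter_element(reps, compose(x, y)):
                            changed = True
    return reps


def order(reps):
    result = 1
    for table in reps:
        result *= len(table)
    return result


def contains(reps, sigma):
    """Membership test: sigma is in G iff it filters down without adding anything."""
    return strip(reps, sigma)[0] is None


s4 = coset_tables([(1, 0, 2, 3), (1, 2, 3, 0)], 4)
klein = coset_tables([(2, 3, 0, 1), (3, 2, 1, 0)], 4)
print(order(s4), order(klein), contains(klein, (1, 0, 2, 3)))
```

Storing each $C_i$ by the image of $i$ makes the test "$\gamma^{-1}\alpha \in G_{i+1}$" a single dictionary lookup, which is the constant-time membership check mentioned in the proof.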

Remark 1. The complexity of the algorithm Filter is $O(n^5)$, because there are
at most $O(n^4)$¹ elements in the union of the sets $C_i C_j$ and we need an extra
$n$ to check whether an element is in its corresponding $C_i$. The complexity of
the third part of the lemma is $O(n^5) \cdot O(\text{test membership in } H)$.

¹ $\sum_{0 \le i \le j \le n-2} (n-i)(n-j)$ is in $O(n^4)$.



We will need, in the transitive case, to be able to decompose the set into
non-trivial blocks of imprimitivity. To be precise, we fix $a \in A$ and for each
$b \in A$, $b \neq a$, we generate the smallest $G$-block containing $\{a, b\}$.

Proposition 3 ([13]). The smallest $G$-block containing $\{a, b\}$ is the
connected component of $a$ in the graph $X$ with $V(X) = A$ and $E(X)$ the
$G$-orbit of $\{a, b\}$ in the set of all (unordered) pairs of elements of $A$.

If $G$ is imprimitive, the block must be proper for some choice of $b$; in that
case, the connected components of $X$ define a $G$-block system. Repeating the
process, we actually obtain an algorithm for the following computational problem.

Lemma 3. Given a set of generators for a subgroup $G$ of $S_n$ and a $G$-orbit
$B$, one can determine, in polynomial time, a minimal $G$-block system in $B$.
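Proposition 3 gives a direct, if not the most efficient, way to compute the smallest block. The sketch below follows it literally for a transitive group: it computes the orbit of the unordered pair $\{a, b\}$ and then the connected component of $a$. Permutations are dictionaries, and `smallest_block` is a name chosen for this illustration only.

```python
def smallest_block(generators, points, a, b):
    """Return the smallest block containing {a, b}, following Proposition 3."""
    start = frozenset((a, b))
    pairs = {start}
    frontier = [start]
    while frontier:  # orbit of the unordered pair {a, b}
        pair = frontier.pop()
        for g in generators:
            image = frozenset(g[x] for x in pair)
            if image not in pairs:
                pairs.add(image)
                frontier.append(image)
    adjacency = {p: set() for p in points}
    for pair in pairs:
        u, v = tuple(pair)
        adjacency[u].add(v)
        adjacency[v].add(u)
    component = {a}
    stack = [a]
    while stack:  # connected component of a in the graph X
        x = stack.pop()
        for y in adjacency[x]:
            if y not in component:
                component.add(y)
                stack.append(y)
    return component


klein_gens = [{1: 3, 3: 1, 2: 4, 4: 2}, {1: 4, 4: 1, 2: 3, 3: 2}]
four_cycle = [{1: 2, 2: 3, 3: 4, 4: 1}]
print(smallest_block(klein_gens, [1, 2, 3, 4], 1, 3))
print(smallest_block(four_cycle, [1, 2, 3, 4], 1, 2))
```

When the result is the whole set for every choice of $b$, the group has no non-trivial block containing $a$, which for a transitive group means it is primitive.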

Thanks to Atkinson [2], we have Algorithm 2, which is a particularly efficient
implementation of the above ideas.



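Algorithm 2: Smallest $G$-block which contains $\{1, \omega\}$
Data: $\omega \neq 1$, $G = \langle g_1, \dots, g_m \rangle$
Result: The smallest $G$-block which contains $\{1, \omega\}$
 1 begin
 2     $C \leftarrow \emptyset$
 3     Set $f(\alpha) = \alpha$ $\forall \alpha \in A$
 4     Add $\omega$ to $C$
 5     Set $f(\omega) = 1$
 6     while $C$ is nonempty do
 7         Delete $\beta$ from $C$
 8         $\alpha \leftarrow f(\beta)$
 9         $j \leftarrow 0$
10         while $j < m$ do
11             $j \leftarrow j + 1$
12             $\gamma \leftarrow \alpha g_j$
13             $\delta \leftarrow \beta g_j$
14             if $f(\gamma) \neq f(\delta)$ then
15                 Ensure $f(\delta) < f(\gamma)$ by interchanging $\gamma$ and $\delta$ if necessary
16                 for $\varepsilon : f(\varepsilon) = f(\gamma)$ do
17                     Set $f(\varepsilon) = f(\delta)$
18                 Add $f(\gamma)$ (its value before the update) to $C$
19     return $\{\varepsilon \in A \mid f(\varepsilon) = 1\}$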



Let $f_0$ be the initial function $f$, and let $f_1, \dots, f_r = f$ be the
variants of $f$ defined by the successive executions of the for loop. Associated
with each function $f_i$ is a partition $P_i$ of $A$; each part of $P_i$ consists
of elements on which $f_i$ takes the same value, i.e., if $B \in P_i$, then
$\forall \alpha, \beta \in B$, $f_i(\alpha) = f_i(\beta)$. Also, $P_{i+1}$ is
obtained from $P_i$ by replacing two parts of $P_i$ by their union; in particular,
$P_i$ is a refinement of $P_{i+1}$, as every part of $P_i$ is contained in a part
of $P_{i+1}$.

We denote by $P_i(\alpha)$ the part of $P_i$ which contains $\alpha$.

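Lemma 4. 1. If $f_i(\alpha) = f_i(\beta)$ then $f_j(\alpha) = f_j(\beta)$
$\forall j \ge i$.

2. $f_i(f_i(\alpha)) = f_i(\alpha)$ $\forall \alpha \in A$ and $\forall i \ge 0$.

Proof. 1. The proof of this part is obvious by construction, because if we change
$f(\alpha)$ we also change $f(\beta)$ by the same value.

2. Clearly $f_0(\alpha) \in P_0(\alpha)$ for all $\alpha$, and it is also evident
that $f_i(\alpha) \in P_i(\alpha)$, so $\alpha, f_i(\alpha) \in P_i(\alpha)$; then
$f_i(\alpha) = f_i(f_i(\alpha))$.

□

Lemma 5. 1. $\alpha \ge f_0(\alpha) \ge f_1(\alpha) \ge \dots \ge f(\alpha)$

2. A point $\beta$ belonged to $C$ if and only if $\beta \neq f(\beta)$.

3. If $\beta$ belonged to $C$, then there exists $\alpha < \beta$ with
$f(\alpha) = f(\beta)$ and $f(\alpha g_j) = f(\beta g_j)$, $j = 1, \dots, m$.

Proof. 1. The first step ensures that $\alpha \ge f_0(\alpha)$ and the step before
the for instruction ensures that $f_i(\alpha) \ge f_{i+1}(\alpha)$.

2. The points of $C$ are added in lines 4 and 18. In line 4, $\beta = \omega$ and
$\omega > f_0(\omega) = 1 = f(\omega)$. In line 18, $\beta = f_i(\gamma)$ for some
$i$ and $\gamma$; then $f_i(\beta) = \beta = f_i(\gamma)$ and
$f_{i+1}(\beta) < \beta$. Conversely, if $\beta > f(\beta)$ then clearly
$f_i(\beta) = \beta > f_{i+1}(\beta)$ for some $i$, so $\beta$ belonged to $C$.

3. Let $\alpha$ be the point defined in line 8, when $\beta$ is deleted from $C$.
Then $\alpha = f_i(\beta) < \beta$ for some $i$. Moreover, by the previous lemma,
$f_i(\alpha) = f_i(\beta)$ and so $f(\alpha) = f(\beta)$. Finally, after line 18
for a given $j$, $f_k(\alpha g_j) = f_k(\beta g_j)$ for some $k$ and so
$f(\alpha g_j) = f(\beta g_j)$.

□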

Lemma 6. $P = P_r$ is invariant under $G$.

Proof. It is sufficient to prove that each $g_j$ preserves $P$. Suppose that there
exist $a, b \in A$, $a \neq b$, with $f(a) = f(b)$ but
$f(g_j(a)) \neq f(g_j(b))$, with $b$ minimal. Then $f(b) = f(a) \le a < b$ and so
$b$ belonged to $C$. Hence there exists $c < b$ with $f(c) = f(b)$ and
$f(g_j(c)) = f(g_j(b))$. Since $f(c) = f(a)$ and $c < b$, the minimality of $b$
ensures that $f(g_j(c)) = f(g_j(a))$. Thus
$f(g_j(b)) = f(g_j(c)) = f(g_j(a))$, which is a contradiction. □

So $\Delta = P(1)$ is a block of $G$ containing 1 and $\omega$. As $G$ is
transitive and the previous lemma states that $P$ is $G$-invariant, $P$ is the
block system containing $\Delta$.

Lemma 7. $\Delta$ is the smallest block containing 1 and $\omega$.

Proof. Let $\Delta_1$ be the smallest block containing 1 and $\omega$, so that
$\Delta_1 \subseteq \Delta$. Let $\mathcal{P} = \{g(\Delta_1) \mid g \in G\}$. Then
$\mathcal{P}$ is a partition of $A$; we now prove that each $P_i$ is a refinement
of $\mathcal{P}$ by induction on $i$. This is clearly true if $i = 0$. Assume now
that $i \ge 0$, that $P_i$ is a refinement of $\mathcal{P}$, and consider a part of
$P_{i+1}$. Such a part is either a part of $P_i$ or the union of two parts of
$P_i$ of the form
$P_i(f_i(\gamma)) \cup P_i(f_i(\delta)) = P_i(\gamma) \cup P_i(\delta)$, where
$\gamma = g_j(\alpha)$, $\delta = g_j(\beta)$ and $P_i(\alpha) = P_i(\beta)$. By
the inductive assumption, $\mathcal{P}(\alpha) = \mathcal{P}(\beta)$. Then

$\mathcal{P}(\gamma) = \mathcal{P}(g_j(\alpha)) = \mathcal{P}(g_j(\beta)) = \mathcal{P}(\delta) \supseteq P_i(f_i(\gamma)) \cup P_i(f_i(\delta))$

This completes the induction and we have
$\Delta = P(1) = P_r(1) \subseteq \mathcal{P}(1) = \Delta_1$. □

Remark 1. There are several ways in which the algorithm can be made faster; see
[2].

In our applications it will be necessary to determine the subgroup of $G$ which
stabilizes all of the blocks.

Lemma 8. Given a set of generators for a subgroup $G$ of $S_n$ and a $G$-orbit
$B$, one can determine, in polynomial time, a set of generators for the subgroup
of $G$ which stabilizes all of the blocks in a $G$-block system in $B$.

Proof. The third part of Lemma 2 guarantees this. Let $G_i$ denote the subgroup
which stabilizes each of the first $i$ blocks. Then (taking $G = G_0$)

$|G_i : G_{i+1}| \le \text{number of blocks} - i$

□

2.2 Algorithms in graph theory 

Whether two rooted phylogenetic trees are isomorphic can be tested easily, using
the extra information that we have, namely
$\varphi(\rho_1(u_i)) = \rho_2(u_i)$. Thanks to this extra information we have
Algorithm 3.

The subroutine PostOrderIterator returns an iterator over the nodes of $T_1$ in
postorder, i.e., first the leaves, then the parents of the leaves, and so on up
to the root.

Lemma 9. The previous algorithm terminates in linear time.

Proof. Let $n$ be the number of leaves and $m = |V(T_1)|$. In the algorithm we
first build the iterator PostOrderIterator, which can be done in $O(m)$ because
each node has to be visited exactly once. Then, in the loop, we first do $n$
trivial operations, corresponding to assigning each $v \in L(T_1)$ to its
corresponding $v' \in L(T_2)$; this step is $O(n)$. Then for every
$w \in \mathrm{Int}(T_1)$






Algorithm 3: PhylogeneticTreeIsomorphism

Data: $T_1 = (T_1, \rho_1, \{u_1, \dots, u_n\})$ and $T_2 = (T_2, \rho_2, \{u_1, \dots, u_n\})$
Result: Test if $T_1$ and $T_2$ are isomorphic
begin
    Set $\varphi(\rho_1(u_i)) = \rho_2(u_i)$ $\forall i$
    Nodes $\leftarrow$ PostOrderIterator($T_1$)
    while Nodes.hasNext() do
        $w \leftarrow$ Nodes.next()
        if $w$ is not a leaf then
            if $\varphi(w)$ == none then
                $v \leftarrow$ child of $w$
                Set $\varphi(w)$ = parent($\varphi(v)$)
            for $v$ child of $w$ do
                if $\varphi(w) \neq$ parent($\varphi(v)$) then
                    return False
    return $\varphi$



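we do $O(|\mathrm{child}(w)|)$ operations to check if the isomorphism is correct,
so in total we perform

$\sum_{w \in \mathrm{Int}(T_1)} |\mathrm{child}(w)| = m - 1$

operations, since every node other than the root is the child of exactly one
interior node. Then the complexity is $O(m) + O(n) + O(m - 1) = O(m)$. □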
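Algorithm 3 can be sketched in Python as follows. The representation is an assumption of this example: each rooted tree is a dictionary mapping every node to its parent (the root maps to `None`), and each labeling maps a taxon to a leaf, with both trees labeled by the same taxa. The final consistency check (injectivity, equal node counts, root sent to root) is an addition of this sketch and does not appear in Algorithm 3. The function names are hypothetical.

```python
def postorder(children, root):
    """Return the nodes of a rooted tree with every child before its parent."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(children.get(node, []))
    return reversed(order)


def phylo_isomorphism(parent1, rho1, parent2, rho2):
    """Return the isomorphism phi as a dict, or None if the trees differ."""
    children1, root1 = {}, None
    for node, par in parent1.items():
        if par is None:
            root1 = node
        else:
            children1.setdefault(par, []).append(node)
    phi = {rho1[taxon]: rho2[taxon] for taxon in rho1}  # forced images of the leaves
    for w in postorder(children1, root1):
        kids = children1.get(w, [])
        if not kids:
            continue  # w is a leaf, already mapped
        if w not in phi:
            image = parent2[phi[kids[0]]]
            if image is None:
                return None  # a child of w was mapped to the root of T2
            phi[w] = image
        if any(parent2[phi[v]] != phi[w] for v in kids):
            return None
    # Extra check, not in Algorithm 3: phi must be a bijection sending root to root.
    ok = (len(parent1) == len(parent2)
          and len(set(phi.values())) == len(phi) == len(parent1)
          and parent2[phi[root1]] is None)
    return phi if ok else None


t1 = {"r": None, "x": "r", "a": "x", "b": "x", "c": "r"}
rho1 = {"A": "a", "B": "b", "C": "c"}
t2 = {"R": None, "X": "R", "p": "X", "q": "X", "s": "R"}
rho2 = {"A": "p", "B": "q", "C": "s"}
t3 = {"R": None, "X": "R", "p": "X", "s": "X", "q": "R"}
print(phylo_isomorphism(t1, rho1, t2, rho2))
print(phylo_isomorphism(t1, rho1, t3, rho2))
```

Each node is visited once and each parent lookup is a dictionary access, which matches the linear bound of Lemma 9 up to the cost of hashing.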

This algorithm can be used to test whether two evolutionary networks are
isomorphic, because we can reduce the size of the network by removing the part
that is tree-like.



Chapter 3 



Trivalent Case 



In this chapter we give an extended explanation of the problem when the valence
of the graphs is 3, and at the end of the chapter we show a generalization to
the general case. The cases of valence 1 and 2 are trivial: for valence 1 we only
have one connected graph, the graph with 2 nodes linked by 1 edge; and for
valence 2 we only have two types of connected graphs, the cycles (such as the
"triangle" with 3 nodes and 3 edges) and the lists with $n$ nodes and $n - 1$
edges.

3.1 Reduction to the Color Automorphism Problem

We start by reducing this graph problem to a group one.

Proposition 4. Testing isomorphism of graphs with bounded valence is
polynomial-time reducible to the problem of determining generators for
$\mathrm{Aut}_e(X)$, where $X$ is a connected graph with the same valence, and $e$
is a distinguished edge.

Proof. We show that if we can obtain a set of generators of $\mathrm{Aut}_e(X)$
then we can test if two connected graphs of bounded valence are isomorphic. Let
$e_1 \in E(X_1)$; then for each $e_2 \in E(X_2)$ we can test whether there exists
an isomorphism from $X_1$ to $X_2$ sending $e_1$ to $e_2$, as we can see in
Algorithm 4. We build the new graph $X$ from the disjoint union $X_1 \cup X_2$ as
follows:

1. Insert new nodes $v_1$ in $e_1$ and $v_2$ in $e_2$.

2. Join $v_1$ to $v_2$ with a new edge $e$.
□

Remark 1. Algorithm 4 works because if an automorphism of $X$ that fixes $e$ and
interchanges $v_1$ and $v_2$ exists, then any set of generators of
$\mathrm{Aut}_e(X)$ will contain one.
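The construction of $X$ in the proof can be sketched as below. The name `build_x` echoes the BuildX routine of Algorithm 4, but its signature and the vertex encoding are assumptions of this example: vertices of the two input graphs are tagged with 1 or 2 to make the union disjoint, and the new nodes are the tuples `("new", 1)` and `("new", 2)`.

```python
def build_x(edges1, edges2, e1, e2):
    """Return (edges of X, v1, v2, e): subdivide e1 and e2, then join the new nodes."""
    v1, v2 = ("new", 1), ("new", 2)
    edges = set()
    for tag, graph_edges, chosen, new_node in ((1, edges1, e1, v1), (2, edges2, e2, v2)):
        chosen = frozenset(chosen)
        for edge in graph_edges:
            a, b = tuple(edge)
            if frozenset(edge) == chosen:
                # step 1: insert the new node in the chosen edge
                edges.add(frozenset(((tag, a), new_node)))
                edges.add(frozenset(((tag, b), new_node)))
            else:
                edges.add(frozenset(((tag, a), (tag, b))))
    e = frozenset((v1, v2))  # step 2: the new distinguished edge
    edges.add(e)
    return edges, v1, v2, e


# Two copies of the complete graph K4, which is trivalent.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
x_edges, v1, v2, e = build_x(k4, k4, (0, 1), (2, 3))
x_vertices = set().union(*x_edges)
print(len(x_vertices), len(x_edges))
```

In this instance every vertex of the result, including the two new nodes, has degree 3, so the valence of the input graphs is preserved.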

Let $X_1$ and $X_2$ be two connected trivalent graphs with $\frac{n-2}{2}$ vertices each and build $X$ as
before; then $X$ is a connected trivalent graph with $n$ vertices. The group $\mathrm{Aut}_e(X)$
is determined through a natural sequence of successive "approximations" $\mathrm{Aut}_e(X_r)$,
where $X_r$ is the subgraph consisting of all vertices and all edges of $X$ which appear
in paths of length $\le r$ through $e$. More formally, if $e = (a, b)$,

$V(X_1) = \{a, b\}, \quad E(X_1) = \{(a, b)\}$
Algorithm 4: Isomorphism of graphs of bounded valence
Data: $X_1$, $X_2$ connected graphs of bounded valence
Result: Test if $X_1$ and $X_2$ are isomorphic
begin
    $e_1 \in E(X_1)$
    for $e_2 \in E(X_2)$ do
        $X \leftarrow$ BuildX($X_1, X_2, e_1, e_2$)
        $G \leftarrow$ Aut($X, e$)
        for $\sigma \in G$ do
            if $\sigma(v_1) == v_2$ then
                return True
    return False



$V(X_r) = \{b \in V(X) \mid \exists\, a \in V(X_{r-1}) \text{ such that } (a, b) \in E(X)\}$

$E(X_r) = \{(a, b) \in E(X) \mid a \in V(X_{r-1})\}$

There are natural homomorphisms

$\pi_r : \mathrm{Aut}_e(X_{r+1}) \to \mathrm{Aut}_e(X_r)$

in which $\pi_r(\sigma)$ is the restriction of $\sigma$ to $X_r$. Now we construct a generating set for
$\mathrm{Aut}_e(X_{r+1})$ given one for $\mathrm{Aut}_e(X_r)$.
For this we solve two problems:

(I) Find a set $\mathcal{K}$ of generators for $K_r$, the kernel of $\pi_r$.

(II) Find a set $\mathcal{S}$ of generators for $\pi_r(\mathrm{Aut}_e(X_{r+1}))$, the image of $\pi_r$.

Then, if $\mathcal{S}'$ is any pullback of $\mathcal{S}$ in $\mathrm{Aut}_e(X_{r+1})$, i.e. $\pi_r(\mathcal{S}') = \mathcal{S}$, then $\mathcal{K} \cup \mathcal{S}'$
generates $\mathrm{Aut}_e(X_{r+1})$. So, the algorithm to compute $\mathrm{Aut}_e(X)$ is Algorithm 5.

Set $V_r = V(X_r) \setminus V(X_{r-1})$. Each vertex in $V_{r+1}$ is connected to one, two
or three vertices in $X_r$. We encode these relationships as follows. Let $A_r$ denote the
collection of all subsets of $V_r$ of size one, two, or three. Define

$f : V_{r+1} \to A_r$

by $f(v) = \{w \in V(X_r) \mid (v, w) \in E(X)\}$, i.e. the neighbor set of $v$.



Algorithm 5: The group $\mathrm{Aut}_e$

Data: A sequence of graphs $Y$, which are the result of BuildX
Result: $\mathrm{Aut}_e(X)$ where $X$ is the last graph in the sequence
begin
    $\mathrm{Aut}_e = \langle (e_1\ e_2) \rangle$
    for $X \in Y$ do
        $K \leftarrow$ Ker($X$)
        $S \leftarrow$ Image($\mathrm{Aut}_e, X$)
        $S_2 \leftarrow$ Pullback($S, X$)
        $\mathrm{Aut}_e = S_2 \cup K$
    return $\mathrm{Aut}_e$



Definition 16. A pair $u, v \in V_{r+1}$, $u \neq v$, will be called twins if they have the
same neighbor set.

Remark 2. There cannot be three distinct vertices with a common neighbor set,
because $X$ is a trivalent graph.

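Proposition 5. If $\sigma \in \mathrm{Aut}_e(X_{r+1})$, then $f(\sigma(v)) = \sigma(f(v))$.

Proof. Let $\sigma \in \mathrm{Aut}_e(X_{r+1})$; then $\sigma$ preserves the set of edges, so

$w \in f(v) \iff (w, v) \in E(X_{r+1}) \iff (\sigma(w), \sigma(v)) \in E(X_{r+1}) \iff \sigma(w) \in f(\sigma(v))$

therefore $f(\sigma(v)) = \sigma(f(v))$. □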

In particular, if $\sigma \in \ker(\pi_r)$, then $\sigma(f(v)) = f(v)$, hence $f(v) = f(\sigma(v))$, so either
$v = \sigma(v)$ or $v$ and $\sigma(v)$ are twins. Since a permutation in $\ker(\pi_r)$ fixes the neighbor
sets of all $v \in V_{r+1}$, its only nontrivial action can involve switching twins. For each
pair $u, v$ of twins in $V_{r+1}$, let $(u\ v) \in \mathrm{Sym}(V(X_{r+1}))$ be the transposition that
switches $u$ and $v$ while fixing all other points. Problem (I) is solved by taking
$\mathcal{K} = \{(u\ v) \mid u \text{ and } v \text{ are twins}\}$.

Proposition 6 (Tutte). For each $r$, $\mathrm{Aut}_e(X_r)$ is a 2-group.

Proof. We have $|\mathrm{Aut}_e(X_{r+1})| = |\mathrm{Im}\,\pi_r| \cdot |K_r|$, where $K_r$ is the elementary abelian
2-group generated by the transpositions of the pairs of twins and $\mathrm{Im}\,\pi_r$ is a subgroup
of $\mathrm{Aut}_e(X_r)$. Since a subgroup of a 2-group is a 2-group, the result follows by
induction on $r$. □

We note that if $\sigma \in \mathrm{Aut}_e(X_r)$ is in $\pi_r(\mathrm{Aut}_e(X_{r+1}))$, then it stabilizes each of the
following three collections:

1. The collection of edges (considered as unordered pairs of vertices) connecting
vertices in $V_r$:

$A' = \{\{v_1, v_2\} \in A \mid (v_1, v_2) \in E(X_{r+1})\}$

2. The collection of subsets of $V_r$ that are neighbor sets of exactly one vertex
in $V_{r+1}$:

$A_1 = \{a \in A \mid a = f(v) \text{ for some unique } v \in V_{r+1}\}$

3. The collection of subsets of $V_r$ that are neighbor sets of exactly two vertices
in $V_{r+1}$, i.e. the "fathers" of twins:

$A_2 = \{a \in A \mid a = f(v_1) = f(v_2) \text{ for some } v_1 \neq v_2\}$

Moreover, this condition characterizes the set $\pi_r(\mathrm{Aut}_e(X_{r+1}))$.

Proposition 7. $\pi_r(\mathrm{Aut}_e(X_{r+1}))$ is precisely the set of those $\sigma \in \mathrm{Aut}_e(X_r)$ which
stabilize each of the collections $A_1, A_2, A'$.

Proof. We need only show that, if $\sigma$ stabilizes $A_1, A_2, A'$, then it does indeed extend
to an element of $\mathrm{Aut}_e(X_{r+1})$. For such $\sigma$, we define the extension as follows. For
each "only child" $v$, $f(v) \in A_1$ implies $\sigma(f(v)) \in A_1$, so we send $v$ to the "only
child" $v'$ such that $f(v') = \sigma(f(v))$. For each pair of twins $v_1, v_2$, $f(v_1) \in A_2$ implies
$\sigma(f(v_1)) \in A_2$, so we map $\{v_1, v_2\}$ to the twin sons of $\sigma(f(v_1)) = \sigma(f(v_2))$ in either
order. By construction, this extension stabilizes the set of edges between $V(X_r)$
and $V_{r+1}$; note that $|f(v)| = |\sigma(f(v))|$. It also stabilizes the edges between "old
points", because $\sigma$ stabilizes the set $A'$. □

Remark 2. We cannot apply the Filter algorithm, because we have no guarantee
that the index of the subgroup that stabilizes the sets $A_1$, $A_2$ and $A'$ has a
polynomial bound.

Now, set $B_r = V(X_{r-1}) \cup A_r$ and $G_r = \mathrm{Aut}_e(X_r)$, and extend the action of $G_r$
to $B_r$, i.e. if $a \in A_r$ then $\sigma(a) = \{\sigma(w) \mid w \in a\}$. To find $\mathcal{S}$, we color each element of
$B_r$ with one of five colors that distinguish:

i) whether or not it is in $A'$,

ii) whether it is in $A_1$, or $A_2$, or neither.

Only five colors are needed, since the collections $A'$ and $A_2$ are disjoint when $r > 1$.
That is, let $C' = B_r \setminus A'$, $C_1 = B_r \setminus A_1$ and $C_2 = B_r \setminus A_2$; then the colors are:

1. $A' \cap A_1$

2. $A' \cap C_1$

3. $C' \cap A_1$

4. $C' \cap A_2$

5. $C' \cap C_1 \cap C_2$
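The five-way coloring above can be sketched in Python as follows. This is only an illustration: the function name `color_of` and the representation of $A'$, $A_1$, $A_2$ as Python sets are assumptions made for the example, not part of the implementation described in this work.

```python
def color_of(x, A_prime, A1, A2):
    """Return a color in 1..5 for an element x of B_r.

    A_prime, A1, A2 are plain Python sets (hypothetical containers);
    A_prime and A2 are assumed to be disjoint, as stated in the text.
    """
    if x in A_prime:
        return 1 if x in A1 else 2   # colors A' ∩ A_1 and A' ∩ C_1
    if x in A1:
        return 3                     # color C' ∩ A_1
    if x in A2:
        return 4                     # color C' ∩ A_2
    return 5                         # color C' ∩ C_1 ∩ C_2


# Tiny made-up instance: three subsets and one plain vertex.
a, b, c = frozenset({1, 2}), frozenset({3}), frozenset({4, 5})
A_prime, A1, A2 = {a}, {a, b}, {c}
colors = {x: color_of(x, A_prime, A1, A2) for x in (a, b, c, 6)}
```

A permutation is then color preserving exactly when it maps every element to an element with the same value of `color_of`.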
We have $\sigma \in \pi_r(\mathrm{Aut}_e(X_{r+1}))$ if and only if $\sigma$ preserves colors in $A_r$. Thus, the
Trivalent Graph Isomorphism problem is polynomial-time reducible to the following:

Problem 1. Input: A set of generators for a 2-subgroup $G$ of $\mathrm{Sym}(A)$, where $A$
is a colored set.

Find: A set of generators for the subgroup $\{\sigma \in G \mid \sigma \text{ is color preserving}\}$.

3.2 The Color Automorphism Algorithm for 2-Groups

With a view toward a recursive divide-and-conquer strategy, we generalize
Problem 1:

Problem 2. Input: Generators for a 2-subgroup $G$ of $\mathrm{Sym}(A)$, a $G$-stable subset
$B$, and $\sigma \in \mathrm{Sym}(A)$.

Find: $C_B(\sigma G)$,

where $C_B(T) = \{\sigma \in T \mid \sigma(b) \text{ has the same color as } b \text{ for all } b \in B\}$. Problem 1 is
the instance of Problem 2 with $B = A$ and $\sigma = \mathrm{id}$.

Let $T, T'$ be subsets of $\mathrm{Sym}(A)$, and $B, B'$ subsets of $A$; then we have:

• $C_B(T \cup T') = C_B(T) \cup C_B(T')$

• $C_{B \cup B'}(T) = C_B(C_{B'}(T))$

We observe first that if $G$ is a subgroup of $\mathrm{Sym}(A)$ and $B$ is a $G$-stable subset,
then $C_B(G)$ is a subgroup of $G$. We also have the following lemma, needed for
the recursive algorithm.

Lemma 10. Let $G$ be a subgroup of $\mathrm{Sym}(A)$, $\sigma \in \mathrm{Sym}(A)$ and $B$ a $G$-stable
subset of $A$ such that $C_B(\sigma G)$ is not empty; then it is a left coset of the subgroup
$C_B(G)$.

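Proof. If $\sigma' \in C_B(\sigma G)$, then $\sigma G = \sigma' G$, because $\sigma' \in \sigma G$. For $\tau \in G$, $b \in B$
we have that $\sigma'(\tau(b))$ has the same color as $\tau(b)$, because $\tau(b) \in B$. Thus $\sigma'\tau \in
C_B(\sigma' G)$ if and only if $\tau \in C_B(G)$. That is, $C_B(\sigma' G) = \sigma' C_B(G)$. □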
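Lemma 10 can be checked by brute force on a very small instance. The sketch below is only an illustration under stated assumptions: permutations are stored as tuples `p` with `p[x]` the image of `x`, the group, the coloring and the helper names (`compose`, `color_preserving`) are made up for the example, and the enumeration is exponential in general, so it is not the algorithm of this chapter.

```python
def compose(p, q):
    """Return the permutation x -> p[q[x]] (apply q first, then p)."""
    return tuple(p[x] for x in q)


def color_preserving(perms, B, color):
    """Brute-force C_B(T): keep the permutations that preserve colors on B."""
    return {p for p in perms if all(color[p[b]] == color[b] for b in B)}


# G is the regular Klein four-group on A = {0, 1, 2, 3} (a 2-group).
G = {(0, 1, 2, 3), (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)}
color = {0: "red", 1: "red", 2: "blue", 3: "blue"}
B = {0, 1, 2, 3}                      # B = A is G-stable
sigma = (1, 0, 2, 3)                  # the transposition (0 1), not in G

coset = {compose(sigma, g) for g in G}
C_G = color_preserving(G, B, color)           # the subgroup C_B(G)
C_coset = color_preserving(coset, B, color)   # the set C_B(sigma G)

# Any element of C_B(sigma G) serves as coset representative.
rep = min(C_coset)
left_coset = {compose(rep, g) for g in C_G}
```

In this instance `C_coset` and `left_coset` coincide, as the lemma predicts.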

Thanks to Lemmas 12 and 10, we can present an algorithm for Problem 2
(Algorithm 6).

Observe that in the case where $G$ is transitive on $B$, we do not need to store
$C_B(\sigma H)$ and $C_B(\sigma\tau H)$ as explicit sets. By Lemma 10 we know that, when
$C_B(\sigma H)$ and $C_B(\sigma\tau H)$ are both non-empty, there exist $\rho_1$ and $\rho_2$ such that

$C_B(\sigma H) = \rho_1 C_B(H), \qquad C_B(\sigma\tau H) = \rho_2 C_B(H)$

Then we form a generating set for $C_B(G)$ by adding $\rho_1^{-1}\rho_2$ to the generators of
$C_B(H)$, and take $\rho_1$ as the coset representative for $C_B(\sigma G)$.
Algorithm 6: $C_B(\sigma G)$

Data: Coset $\sigma G \subseteq \mathrm{Sym}(A)$, where $A$ is a colored set and $G$ a 2-group, and
      a $G$-stable subset $B$ of $A$.
Result: $C_B(\sigma G)$
begin
    case $B = \{b\}$
        if $\sigma(b) \sim b$ then
            $C_B(\sigma G) = \sigma G$
        else
            $C_B(\sigma G) = \emptyset$
    case $G$ is intransitive on $B$
        Let $B_1$ be a nontrivial orbit
        $B_2 = B \setminus B_1$
        $C_B(\sigma G) = C_{B_2}(C_{B_1}(\sigma G))$
    case $G$ is transitive on $B$
        Let $\{B_1, B_2\}$ be a minimal $G$-block system
        Find the subgroup $H$ of $G$ that stabilizes $B_1$
        Let $\tau \in G \setminus H$
        $C_B(\sigma G) = C_{B_2}(C_{B_1}(\sigma H)) \cup C_{B_2}(C_{B_1}(\sigma\tau H))$
    return $C_B(\sigma G)$






Proposition 8. The previous algorithm runs in polynomial time.
Proof. It is a standard induction argument. □

Remark 3. In the next section we will see an upper bound on the cost of this algorithm.

3.3 Study of complexity 

In this section we prove that the trivalent graph isomorphism problem is solvable in
polynomial time, and we obtain an upper bound for the complexity of Algorithm 4.
To do this we divide the algorithm into parts: first we compute the complexity of
the algorithm BuildX, then the complexity of the algorithm Aut, which we split into
the algorithms Ker, Image and Pullback. Finally, we raise the exponent by one,
because in the worst case we do this for all $e_2 \in E(X_2)$, and $X_2$ has $O(3n) = O(n)$
edges, since the valence of $X_2$ is 3 and every node has at most 3 edges.

3.3.1 Algorithm BuildX 

In this algorithm we build a sequence of graphs, and the cost of the algorithm
is the cost of building this sequence. We assume that the cost of finding the
neighbors of a vertex is $O(1)$. Under this assumption the cost of building the
sequence is $O(n)$: we only need to build the final graph starting from the initial
edge, saving the resulting graph at each stage, so the cost of the algorithm is the
number of stages. Each stage adds at least one node and $X$ has $2n + 2$ nodes, so
there are at most $2n + 2$ stages, thus the cost of BuildX is $O(n)$.
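A minimal sketch of this construction in plain Python is given below. It is not the SAGE implementation: the adjacency-dictionary representation, the vertex tags `(1, v)` and `(2, v)`, the names `"v1"` and `"v2"` for the inserted nodes, and the functions `build_x` and `levels` are all assumptions made for the example.

```python
def build_x(adj1, adj2, e1, e2):
    """Build X from two graphs given as {vertex: set_of_neighbors}.

    Vertices of X_1 and X_2 are tagged (1, v) and (2, v); the two inserted
    nodes are called "v1" and "v2" (hypothetical names).
    """
    X = {}
    for tag, adj in ((1, adj1), (2, adj2)):
        for v, nbrs in adj.items():
            X[(tag, v)] = {(tag, w) for w in nbrs}
    for tag, (a, b), new in ((1, e1, "v1"), (2, e2, "v2")):
        a, b = (tag, a), (tag, b)
        X[a].discard(b)          # subdivide the edge (a, b) ...
        X[b].discard(a)
        X[a].add(new)            # ... with the new node
        X[b].add(new)
        X[new] = {a, b}
    X["v1"].add("v2")            # the distinguished edge e = (v1, v2)
    X["v2"].add("v1")
    return X


def levels(X, e=("v1", "v2")):
    """Return [V_1, V_2, ...]: the vertices first reached at each stage."""
    seen = set(e)
    frontier = set(e)
    out = [set(e)]
    while True:
        nxt = {w for v in frontier for w in X[v]} - seen
        if not nxt:
            return out
        seen |= nxt
        out.append(nxt)
        frontier = nxt


# Example: two copies of the complete graph K4, which is trivalent.
K4 = {v: {w for w in range(4) if w != v} for v in range(4)}
X = build_x(K4, K4, (0, 1), (0, 1))
stages = levels(X)
```

Each vertex is visited once and has at most three neighbors, which matches the linear cost argued above under the $O(1)$ neighbor-lookup assumption.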

3.3.2 Algorithm Aut 

This algorithm just runs the loop of Algorithm 5 $O(n)$ times, and in every stage of
Algorithm 5 we run the algorithms Ker, Image and Pullback.

Algorithm Ker 

The complexity of this algorithm is the complexity of building the function $f : V_{r+1} \to
A_r$ plus the complexity of searching for the pairs of nodes that have the same image.
The cost of building the function is $O(n)$, assuming that the cost of finding the
neighbors of a node is $O(1)$, and searching for pairs in a vector with $O(n)$ elements
is $O(n^2)$, because the vector is not sorted. So the total cost of the algorithm Ker is $O(n^2)$.
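As an illustration of the twin search, the sketch below groups the new vertices by neighbor set with a dictionary. This is an assumption of the example rather than the method analysed above: with hashing the grouping takes expected linear time instead of the $O(n^2)$ pairwise search, and the names `neighbor_map` and `twins` are hypothetical.

```python
from collections import defaultdict


def neighbor_map(X, old, new):
    """f: each new vertex -> frozenset of its neighbors among the old vertices."""
    return {v: frozenset(X[v] & old) for v in new}


def twins(f):
    """Return the pairs of new vertices that share the same neighbor set."""
    groups = defaultdict(list)
    for v, nbrs in f.items():
        groups[nbrs].append(v)
    return [tuple(vs) for vs in groups.values() if len(vs) == 2]


# Made-up stage: old vertices a, b; new vertices c, d (twins) and x.
X = {"a": {"b", "c", "d"}, "b": {"a", "c", "d", "x"},
     "c": {"a", "b"}, "d": {"a", "b"}, "x": {"b"}}
f = neighbor_map(X, old={"a", "b"}, new=["c", "d", "x"])
pairs = twins(f)
```

Each pair returned corresponds to one transposition in the generating set of the kernel.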

Algorithm Image 

The complexity of this algorithm is dominated by the cost of building the sets
$A_1, A_2, A'$, the cost of coloring the set $B_r$ and the complexity of the algorithm $C_B(\sigma G)$.

The complexity of building the sets $A_1, A_2, A'$ is $O(n^2)$, because for each node in
$V(X_r)$ we need to search whether it is an "only child" or not. The complexity of
coloring the set $B_r$ is $O(n^4)$, because the cost of coloring an element of $B_r$ is $O(n)$
and we have $O(n^3)$ elements in $B_r$.

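The complexity of $C_B(\sigma G)$ is less straightforward: we need the complexity of Algorithm
2 and Algorithm 1. The complexity of Algorithm 2 is $O(n^3)$ [10], and we have
seen that Algorithm 1 is $O(n^5) \cdot O(n^2) = O(n^7)$. With all of this, the running time
$T(n)$ of $C_B(\sigma G)$ satisfies a recurrence with one case per branch of Algorithm 6; in the
worst case, when $G$ is transitive on $B$, it reads $T(n) = O(n^7) + 4\,T(n/2)$.

Therefore the complexity of $C_B(\sigma G)$ is $O(n^8)$.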
Algorithm Pullback 

This algorithm just carries out the procedure of Proposition 7, so we only need to
extend each $\sigma$ in the generating set of $\mathcal{S}$, which stabilizes the sets $A_1$, $A_2$ and
$A'$, to a $\sigma' \in \mathrm{Aut}_e(X_{r+1})$. This extension can be done in $O(n)$ time and we have
$O(n)$ generators of $\mathcal{S}$, so the cost of the algorithm Pullback is $O(n^2)$.
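The extension step can be sketched as follows. This is a hedged illustration, not the SAGE code: `sigma` is assumed to be a dictionary on the old vertices, `f` a dictionary from new vertices to frozensets of old neighbors, and the function name `pullback` is hypothetical. The sketch only checks the conditions on $A_1$ and $A_2$ (matching numbers of children); it does not check the edges in $A'$.

```python
def pullback(sigma, f):
    """Extend sigma (a dict on the old vertices) to the new vertices.

    f maps each new vertex to the frozenset of its old neighbors.
    Returns the extended dict, or None when sigma does not map neighbor
    sets with k children to neighbor sets with k children.
    """
    children = {}
    for v, nbrs in f.items():
        children.setdefault(nbrs, []).append(v)
    extended = dict(sigma)
    for nbrs, kids in children.items():
        image = frozenset(sigma[w] for w in nbrs)
        targets = children.get(image)
        if targets is None or len(targets) != len(kids):
            return None
        for v, t in zip(kids, targets):   # twins are matched in either order
            extended[v] = t
    return extended


# Made-up stage: a and b are the ends of e; c, d hang from a and x, y from b.
f = {"c": frozenset({"a"}), "d": frozenset({"a"}),
     "x": frozenset({"b"}), "y": frozenset({"b"})}
swap = {"a": "b", "b": "a"}
ext = pullback(swap, f)
bad = pullback(swap, {"c": frozenset({"a"}),
                      "x": frozenset({"b"}), "y": frozenset({"b"})})
```

Each new vertex is handled once, in line with the $O(n)$ cost per generator stated above.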

Summarizing, the complexity of Algorithm 5 is $O(n)\,(O(n^2) + O(n^8) + O(n^2)) =
O(n^9)$ and the total complexity of the whole algorithm is $O(n^{10})$.

3.4 Improvements for the Implementation 

We have already seen that we can test whether two trivalent graphs are isomorphic in
polynomial time. Now we present some improvements of the algorithm to show
that testing whether two trivalent graphs are isomorphic can be done in $O(n^3 \log n)$ time.

The first improvement is to remove the triplets in $A_r$. Recall that triplets are
included because the neighbor set of a vertex $v \in V_{r+1}$ could have cardinality
3. This situation can be avoided by replacing each such $v$ by a triangle with vertices at
"level" $r + 1$ and with labeled edges, as shown in Figure 3.1. The result is an
edge-labeled graph, denoted by $\tilde{X}$, whose neighbor sets have size strictly less than 3.
It is presumed that automorphisms map labeled edges to labeled edges, so the
computation of $\mathrm{Aut}_e(\tilde{X})$ is the same as that of $\mathrm{Aut}_e(X)$ except that $B_r$ need only
include the subsets of $V_r$ of size 1 or 2; the collection $A'$ is split into

$A'_a$, the collection of unlabeled edges connecting vertices in $V_r$,

$A'_b$, the collection of labeled edges connecting vertices in $V_r$,

and an additional color is allowed for an element of $B_r$ to distinguish whether it
is in $A'_a$, or $A'_b$, or neither.

Figure 3.1: Replacing the triplets in the neighbor sets

The recurrence for the running time $T(n)$ of the algorithm $C_B(\sigma G)$, used in
Section 3.3.2, is

$$T(n) = \begin{cases} 1 & \text{if } n = 1 \\ 2\,T(n/2) & \text{if } G \text{ is intransitive on } B \\ O(n^7) + 4\,T(n/2) & \text{if } G \text{ is transitive on } B \end{cases}$$

so in the worst case

$$T(n) = O(n^7) + 4\,T(n/2) = \sum_{i=0}^{\log n} 4^i\, O\big((n/2^i)^7\big) + 4^{\log n} = O(n^8) + n^2 = O(n^8)$$

We also reformulate $B_r := V_r \times V_r$, in which $(v, v)$ has the color of $v$, while both
$(u, v)$ and $(v, u)$ inherit the color of $\{u, v\}$. With this color assignment the
reformulation retains the identification of $\mathrm{Im}(\pi_r)$ with the color-preserving subgroup.

It is convenient to present 2-groups in a manner that facilitates several key
computations.

Definition 17 (Smooth generating sequence). Let $G$ be a 2-group generated by
$\{g_1, \dots, g_k\}$; then the sequence $(g_1, \dots, g_k)$ will be called a smooth generating
sequence (SGS) for $G$ if $[G^{(i)} : G^{(i-1)}] \le 2$ for $i = 1, \dots, k$, where $G^{(i)} = \langle g_1, \dots, g_i \rangle$.

If we have a 2-group $G$ with a smooth generating sequence, then it is easy to
construct an SGS for a subgroup $H$ of index 2.

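Lemma 11. Let $G$ be a 2-group with SGS $(g_1, \dots, g_k)$, and let $H$ be a subgroup of
index 2. Let $j = \min\{i \mid g_i \notin H\}$ and assign

$$\beta_i = \begin{cases} g_i & \text{if } g_i \in H \\ g_i^2 & \text{for } i = j \\ g_j^{-1} g_i & \text{if } g_i \notin H \text{ and } i > j \end{cases}$$

Then, with $\beta_1, \dots, \beta_k$ constructed as above,

1. $(\beta_1, \dots, \beta_k)$ is an SGS for $H$.

2. The time to compute this sequence is $O(k\,|B|)$, assuming that a membership
test requires time $O(|B|)$.

Proof. The timing is clear. Let $H^{(i)} = \langle \beta_1, \dots, \beta_i \rangle$; then it is clear that $H^{(i)} = G^{(i)}$
for all $i < j$ and $H^{(i)} \le G^{(i)}$ for all $i \ge j$. Then, for $i > j$, $g_i \notin G^{(i-1)}$ implies
$\beta_i \notin H^{(i-1)}$. So for all $i \neq j$, $[H^{(i)} : H^{(i-1)}] \ge [G^{(i)} : G^{(i-1)}]$. Using that $(g_1, \dots, g_k)$
is an SGS, we have

$$\prod_{i=1}^{k} [G^{(i)} : G^{(i-1)}] = |G| = 2\,|H| \ge 2\,|H^{(k)}| = 2 \prod_{i=1}^{k} [H^{(i)} : H^{(i-1)}] \ge \prod_{i=1}^{k} [G^{(i)} : G^{(i-1)}]$$

So we conclude that $[H^{(i)} : H^{(i-1)}] \le 2$ and $H^{(k)} = H$. □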
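The construction of the lemma can be sketched in Python. The sketch assumes the usual assignment $\beta_i = g_i$ if $g_i \in H$, $\beta_j = g_j^2$, and $\beta_i = g_j^{-1} g_i$ for the remaining $g_i \notin H$; permutations are tuples, the membership test `in_H` is supplied by the caller, and all function names are hypothetical.

```python
def compose(p, q):
    """Return the permutation x -> p[q[x]] (apply q first, then p)."""
    return tuple(p[x] for x in q)


def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for x, y in enumerate(p):
        inv[y] = x
    return tuple(inv)


def sgs_for_index2_subgroup(gens, in_H):
    """Turn an SGS `gens` of G into a generating sequence of the index-2 subgroup H.

    in_H is a membership test for H (an assumption of this sketch).
    """
    j = next(i for i, g in enumerate(gens) if not in_H(g))
    gj_inv = inverse(gens[j])
    betas = []
    for i, g in enumerate(gens):
        if in_H(g):
            betas.append(g)                    # already in H
        elif i == j:
            betas.append(compose(g, g))        # g_j squared lies in H
        else:
            betas.append(compose(gj_inv, g))   # product of two elements outside H
    return betas


# G = <(0 1), (2 3)> (Klein four-group), H = {id, (0 1)(2 3)} has index 2.
gens = [(1, 0, 2, 3), (0, 1, 3, 2)]
H = {(0, 1, 2, 3), (1, 0, 3, 2)}
betas = sgs_for_index2_subgroup(gens, lambda p: p in H)
```

The loop performs one membership test per generator, which is where the $O(k\,|B|)$ bound of the lemma comes from.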

Remark 4. One can see in [7] that SGS are preserved under homomorphisms and
lifting.

3.4.1 Precomputing the Blocks 

The most difficult part of the algorithm is the recursive calls for $C_B(\sigma G)$. The
work can be reorganized so as to limit the number of distinct blocks $B$ visited.
These blocks form a tree that is precomputed and guides the recursion.

Definition 18 (Structure tree). Let $G$ be a 2-group acting on $B$. We call a binary
tree $T$ a structure tree for $B$ with respect to $G$, $T = \mathrm{Tree}(B, G)$, if

1. the set of leaves of $T$ is $B$,

2. the action of any $\sigma \in G$ on $B$ can be lifted to an automorphism of $T$.

It is important to remark that we can precompute the entire structure tree for
the initial $(B, G)$ as follows:

Lemma 12. Given an SGS $(g_1, \dots, g_k)$ for $G \le \mathrm{Sym}(B)$, with $|B| = m$, let $\Phi(x, y)$
denote the time bound for union-find with $x$ operations on $y$ elements. We have
the following time bounds:

1. The orbits of $G$ in $B$ can be computed in time $O(km)$.

2. If $G$ is transitive on $B$, a minimal block system $\{B_L, B_R\}$ for $G$ on $B$ can be
computed in time $O(\Phi(2km, 2m))$.

3. A structure tree $\mathrm{Tree}(B, G)$ can be computed in time $O(\Phi(4km, 4m))$.

4. Let $G_r = \mathrm{Aut}_e(X_r)$, $B_r = V_r \times V_r$ and $m_r = |V_r|$; then a structure tree
$\mathrm{Tree}(B_r, G_r)$ can be constructed in time $O(\Phi(4km_r, 4m_r) + m_r^2)$.

5. The structure trees $\mathrm{Tree}(B_r, G_r)$ for all stages $r = 1, \dots, n - 2$ can be
constructed in total time $O(n^2)$.

A proof of this lemma can be found in [7].
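Item 1 of the lemma can be illustrated with a short breadth-first search. This is a sketch under assumptions: the set $B$ is taken to be `range(m)`, generators are tuples `g` with `g[x]` the image of `x`, and the function name `orbits` is hypothetical.

```python
def orbits(m, generators):
    """Orbits on range(m) of the group generated by `generators`.

    Every point is expanded once and each expansion applies the k
    generators, which gives the O(km) bound.
    """
    seen = [False] * m
    result = []
    for start in range(m):
        if seen[start]:
            continue
        seen[start] = True
        orbit = [start]
        for x in orbit:              # the list grows while we scan it
            for g in generators:
                y = g[x]
                if not seen[y]:
                    seen[y] = True
                    orbit.append(y)
        result.append(orbit)
    return result


# Example: the group generated by (0 1) and (2 3) acting on five points.
example = orbits(5, [(1, 0, 2, 3, 4), (0, 1, 3, 2, 4)])
```

The group is transitive on $B$ exactly when a single orbit is returned.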



Data: $B$, $G$
Result: $T = \mathrm{Tree}(B, G)$
begin
    Let the root of $T$ be $B$
    if $|B| = 1$ then
        return
    Find the orbits of $G$ in $B$
    if $G$ is transitive then
        Find a minimal block system $\{B_L, B_R\}$ for $G$ on $B$
        Find the subgroup $H$ of $G$ that stabilizes $B_L$
        Find $\tau \in G \setminus H$
        return $T = \mathrm{Tree}(B_L, H) \cup \tau(\mathrm{Tree}(B_L, H))$ (joined by the new root $B$)
    else
        Partition $B$ into two nontrivial $G$-stable subsets $B_L$, $B_R$
        return $T = \mathrm{Tree}(B_L, G) \cup \mathrm{Tree}(B_R, G)$ (joined by the new root $B$)



3.4.2 Other improvements 

When we compute $C_B(\sigma G)$ we can avoid deeper recursion by replacing case 1,
where $|B| = 1$, by

Case 1a ($\exists\, i : |B \cap \Omega_i| \neq |\sigma(B) \cap \Omega_i|$): $C_B(\sigma G) := \emptyset$

Case 1b ($\exists\, i : B \cup \sigma(B) \subseteq \Omega_i$): $C_B(\sigma G) := \sigma G$

where $\Omega_i$ denotes the set of elements in $A$ with color $i$.
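The two shortcut cases can be tested with color counts, as in the sketch below. It is an illustration only: the coloring is assumed to be a dictionary, `sigma` a tuple with `sigma[b]` the image of `b`, and the function name `shortcut` and its return values are hypothetical.

```python
from collections import Counter


def shortcut(B, sigma, color):
    """Decide whether a call on (B, sigma) can stop without recursion.

    Returns "empty" in case 1a, "whole" in case 1b, and None otherwise.
    """
    source = Counter(color[b] for b in B)
    target = Counter(color[sigma[b]] for b in B)
    if source != target:
        return "empty"      # some color class has different sizes in B and sigma(B)
    if len(source) == 1:
        return "whole"      # B and sigma(B) lie inside one color class
    return None


color = {0: "red", 1: "red", 2: "blue", 3: "blue"}
case_1a = shortcut({0, 1}, (2, 3, 0, 1), color)
case_1b = shortcut({0, 1}, (1, 0, 2, 3), color)
neither = shortcut({0, 2}, (1, 0, 3, 2), color)
```

Only when the function returns None does the recursion have to descend into the sons of $B$.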

A non-leaf node $B$ of $\mathrm{Tree}(B, G)$ is called transitive if the entry group $G_B$ acts
transitively on the set $\{B_L, B_R\}$, and intransitive otherwise. A transitive node $B$
is called color-transitive if the exit group $C_B(G_B)$ acts transitively on $\{B_L, B_R\}$.
With these definitions we can reformulate the conditions in cases 2 and 3:

Case 2 ($B$ is intransitive)

Case 3 ($B$ is transitive)

Let $Q = \bigcup_{i<6} \Omega_i$; then a node $B$ of $T = \mathrm{Tree}(B, G)$ will be called inactive if
$B \cap Q = \emptyset$ and active otherwise. We say that the node $B$ is visited each time a
call to $C_B$ does not terminate in case 1a or 1b.

Definition 19. The subtree $\mathrm{Tree}_p(B, G)$ of $\mathrm{Tree}(B, G)$ consisting of the active
nodes is called the pruned tree.

Observe that the pruned tree still guides the recursion.

We call an active node $B$ facile if $B$ is intransitive with exactly one active son,
and nonfacile otherwise. Let $\Delta(B)$ denote the nearest nonfacile descendant of $B$.
Then, if $\sigma$ is color-preserving on $\Delta(B)$, it must be color-preserving on $B$. Hence
$C_B(\sigma G_B) = C_{\Delta(B)}(\sigma G_B)$, so that we can pass to node $\Delta(B)$. With these facts we
have the following algorithm for $C_B(\sigma G)$.

Data: $T = \mathrm{Tree}(B, G)$, an SGS for $G$
Result: $C_B(\sigma G)$
begin
    case $\exists\, i : |B \cap \Omega_i| \neq |\sigma(B) \cap \Omega_i|$
        return $\emptyset$
    case $\exists\, i : B \cup \sigma(B) \subseteq \Omega_i$
        return $\sigma G$
    case $B$ is facile
        return $C_{\Delta(B)}(\sigma G)$
    case $B$ is intransitive
        return $C_{B_R} C_{B_L}(\sigma G)$
    case $B$ is transitive
        Find the subgroup $H$ of $G$ that stabilizes $B_L$
        Find $\tau \in G \setminus H$
        return $C_{B_R} C_{B_L}(\sigma H) \cup C_{B_R} C_{B_L}(\sigma\tau H)$



Lemma 13. Assuming $\mathrm{Tree}(B_r, G_r)$ is constructed as a complete binary tree,
adding trivial nodes if necessary, it has at most $O(m_r \log m_r)$ active nodes.

Proof. The pruned tree has at most $2m_r$ leaves. Since $\mathrm{Tree}(B_r, G_r)$ has $m_r^2$ leaves,
all paths within it, hence all paths within the pruned tree, have length at most
$2 \log m_r$. □

Lemma 14. There are at most $2m_r$ intransitive, nonfacile nodes in the pruned tree.

Proof. Each intransitive, nonfacile node has two sons in the pruned tree, which has
at most $2m_r$ leaves. □

3.4.3 The Time Bound 

We know that the structure trees $\mathrm{Tree}(B_r, G_r)$ for all $r$ can be found in time $O(n^2)$,
and that pruning the tree, including the construction of $\Delta$, takes $O(n \log n)$. We also
need that the entry groups for all nodes of the structure trees and the $\tau$'s are
computed in $O(n^3)$, and that transitivity is tested for all nodes in time $O(n^2 \log n)$.
With all of this we have the following theorem.

Theorem 1. Let $X$ be an $n$-vertex, connected, trivalent graph. Then $\mathrm{Aut}_e(X)$
can be computed in time $O(n^3)$.



A proof of this theorem can be found in [7].

So $\mathrm{Aut}_e(X)$ can be computed in time $O(n^3)$ and we have
$O(n)$ edges to test, so we derive the following:

Theorem 2. Let $X_1, X_2$ be $n$-vertex, connected, trivalent graphs. Then testing whether
$X_1$ and $X_2$ are isomorphic can be done in time $O(n^4)$.

This is a great improvement on the first bound that we found in the previous
section.

3.4.4 More improvements 

In the implementation we made other improvements that do not reduce the
theoretical complexity, but significantly reduce the running time. The improvements
are:

• We do not compute the whole group $\mathrm{Aut}_e(X)$; we only need to know whether there is
an element of $\mathrm{Aut}_e(X)$ that transposes the two chosen edges, so we only keep
the permutations that satisfy this. The gain shows especially for large $n$, when the
group $\mathrm{Aut}_e(X_r)$ is very large.

• With the previous improvement we stop early when $X_1$ and $X_2$ are not
isomorphic, because we check at every round whether there is an isomorphism which
exchanges $X_1$ and $X_2$.

• Another very useful improvement is to check first whether $\#E(X_1) = \#E(X_2)$. This
avoids a lot of computation when $X_1$ and $X_2$ are chosen at random.

3.4.5 Other improvements that were not applied

The theoretical complexity can be improved to $O(n^3 \log n)$, but for this we have to
compute the whole group $\mathrm{Aut}_e(X_i)$ beforehand and we cannot apply the improvements
shown previously; so despite the lower complexity, the computation time increases.

Another improvement that was not applied, but would be useful, is the
implementation of our own permutation group class. We note that most of the
time is spent working with the group of permutations, and this
is the part that grows the most as $n$ gets bigger. We have two ideas to implement this
class:

1. Every permutation $\sigma$ is an array of variable size, and when we multiply this
permutation by a new transposition we only need to append two elements to this array.
Then, to find the image of an element $a$, we only need to know the position $i$ of
this element in the array; the image of the element is

$$\sigma(a) = \begin{cases} a & \text{if } a \notin \sigma \\ \sigma[i - 1] & \text{if the index } i \text{ of } a \text{ is odd} \\ \sigma[i + 1] & \text{if the index } i \text{ of } a \text{ is even} \end{cases}$$

Remember that in Python the first position in an array is position 0.

This works because in the algorithm we always add a transposition which is
disjoint from the previous transpositions.

2. Every permutation $\sigma$ is an array of size $n$ and every position holds the image
of that position, so the image of $a$ is $\sigma[a - 1]$.
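Both ideas can be sketched with plain Python lists. The functions below are illustrations only, since this class was not implemented in this work; all names are hypothetical, and for idea 1 the list is assumed to hold disjoint transpositions as consecutive pairs `[u1, v1, u2, v2, ...]`.

```python
def image_pairs(pairs, a):
    """Idea 1: image of a under a product of disjoint transpositions."""
    if a not in pairs:
        return a                      # a is a fixed point
    i = pairs.index(a)                # 0-based position of a
    return pairs[i + 1] if i % 2 == 0 else pairs[i - 1]


def multiply_by_transposition(pairs, u, v):
    """Idea 1: multiplying by a new disjoint transposition appends two entries."""
    return pairs + [u, v]


def image_array(perm, a):
    """Idea 2: perm[a - 1] is the image of the point a (points are 1..n)."""
    return perm[a - 1]


def compose_arrays(p, q):
    """Idea 2: the permutation 'apply q first, then p'."""
    return [p[q[a - 1] - 1] for a in range(1, len(q) + 1)]


sigma = multiply_by_transposition([1, 4], 2, 3)   # (1 4)(2 3)
product = compose_arrays([2, 1, 3], [1, 3, 2])
```

In idea 1 a lookup costs a scan of the list, while in idea 2 a lookup is a single index access at the price of storing all $n$ images.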

3.5 General Case 

In this short section we show that the trivalent case extends to the general
case, but we will not go into as much depth as in the trivalent case, because the
complexity of the algorithm would be too large, although it would still be polynomial.

We now consider graphs of valence bounded by $t$, where $t$ is fixed. It is important
to fix $t$ since otherwise the algorithm would not be polynomial. The procedure
of the trivalent case generalizes, reducing the isomorphism problem to a certain
color automorphism problem. So, if we show that in the general case $\mathrm{Aut}_e(X_r)$ is
a 2-group, then we obtain the generalization.

The reduction to determining the kernel and the image of $\pi_r$ remains
intact; the set $A$ is now the collection of all non-empty subsets of $V(X_r)$ of size
lower than $t - 2$, and the map $f$ has the previous meaning. With all this,
an element $\sigma \in \mathrm{Aut}_e(X_{r+1})$ now belongs to the kernel if and only if it stabilizes
$f^{-1}(a)$ for each $a \in A$. These sets form a partition of $V(X_{r+1}) \setminus V(X_r)$, and $K_r$ is the
direct product

$$K_r = \prod_{a \in A} \mathrm{Sym}(f^{-1}(a))$$

Each of these factors can be specified with at most two generators, so $K_r$ is a
2-group. We can adapt the proof of Proposition 6 to obtain that $\mathrm{Aut}_e(X_r)$ is
a 2-group. Now, using the rest of the arguments of the trivalent case, we get that
the general case can be tested in polynomial time. Finally, $\sigma \in \mathrm{Aut}_e(X_r)$ is in the
image of $\pi_r$ if and only if $\sigma$ stabilizes the sets

$$A_s = \{a \in A \mid |f^{-1}(a)| = s\}, \quad 1 \le s \le t - 1$$

and the set $A'$ of new edges; we need $2t$ colors to color $A$.



Chapter 4 



Implementation test 



Finally, in this chapter we present some examples and tests using our own
SAGE implementation. The examples show that the code works correctly,
and the tests show that it runs in a reasonable time. Although SAGE's own
isomorphism algorithm runs more quickly, the two are comparable.

Example 5. In this first example we test two graphs which are isomorphic. The
first graph is the graph with edges {(1, 7), (1, 10), (2, 3), (2, 4), (3, 4), (4, 9), (5, 6), (6, 8),
(7, 8), (7, 9), (8, 9)}, and the second is the graph with edges {(2, 3), (2, 10), (1, 7), (1, 4),
(7, 4), (4, 9), (5, 6), (6, 8), (3, 8), (3, 9), (8, 9)}. Figures 4.1 and 4.2 show these two graphs.
The instructions in SAGE to create these graphs are

sage: X3 = Graph([(1, 7), (1, 10), (2, 3), (2, 4), (3, 4), (4, 9),
                  (5, 6), (6, 8), (7, 8), (7, 9), (8, 9)])
sage: X4 = Graph([(2, 3), (2, 10), (1, 7), (1, 4), (7, 4), (4, 9),
                  (5, 6), (6, 8), (3, 8), (3, 9), (8, 9)])

Finally, we test if they are isomorphic:

sage: Isomorphism(X3, X4, 10, Iso=True)
1 --> 2
2 --> 1
3 --> 7
4 --> 4
5 --> 5
6 --> 6
7 --> 3
8 --> 8
9 --> 9
10 --> 10
True

Clearly, this produces an isomorphism between X3 and X4.

Figure 4.1: The first graph of Example 5

Figure 4.3: The first graph of Example 6
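The mapping printed in Example 5 can be checked independently of SAGE with a few lines of plain Python. The edge lists and the mapping are copied from the example; the helper names `edge_set` and `is_isomorphism` are hypothetical and only serve this check.

```python
X3 = [(1, 7), (1, 10), (2, 3), (2, 4), (3, 4), (4, 9),
      (5, 6), (6, 8), (7, 8), (7, 9), (8, 9)]
X4 = [(2, 3), (2, 10), (1, 7), (1, 4), (7, 4), (4, 9),
      (5, 6), (6, 8), (3, 8), (3, 9), (8, 9)]
phi = {1: 2, 2: 1, 3: 7, 4: 4, 5: 5, 6: 6, 7: 3, 8: 8, 9: 9, 10: 10}


def edge_set(edges):
    """Undirected edges as a set of two-element frozensets."""
    return {frozenset(e) for e in edges}


def is_isomorphism(phi, edges_a, edges_b):
    """True if phi is a bijection that maps the edges of a onto the edges of b."""
    image = {frozenset((phi[u], phi[v])) for u, v in edges_a}
    return len(set(phi.values())) == len(phi) and image == edge_set(edges_b)


result = is_isomorphism(phi, X3, X4)
```

The check confirms that every edge of the first graph is sent to an edge of the second, and that both graphs have the same number of edges.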

Example 6. In the following example we check two graphs which are not
isomorphic. Figures 4.3 and 4.4 show the two graphs to be checked.
The instructions in SAGE to create these graphs are

sage: X1 = Graph([(1, 7), (1, 8), (1, 10), (2, 3), (3, 6),
                  (4, 5), (5, 6), (6, 10), (7, 9), (7, 10), (8, 9)])

sage: X2 = Graph([(1, 7), (1, 9), (2, 3), (2, 5), (2, 10),
                  (4, 5), (4, 6), (4, 10), (6, 8), (7, 8), (7, 10)])

sage: Isomorphism(X1, X2, 10)

False

Now we present some plots of different timing tests. The first plot,
Figure 4.5, shows the time spent by the algorithm to test whether two random graphs are
isomorphic. The times are so small because two random graphs will probably
have a different number of edges. Although in most of the examples
the algorithm ends because the graphs have a different number of edges, sometimes the
algorithm enters the loop, and in this case the algorithm is relatively efficient.

Figure 4.4: The second graph of Example 6

Figure 4.5: Relation seconds-nodes with random graphs

To make the graphs more similar, we perform another test. In this test
the degrees of the first $n - 1$ nodes are the same and the last is chosen randomly;
this way a third of the graphs will be isomorphic. In this case we also have
reasonable times, and the relation between time and nodes can be seen in Figure 4.6
and Figure 4.7.

Finally, we show what happens if we test isomorphic graphs. In this case
the time grows, but we can see in Figure 4.8 that the time grows more slowly
than $(x/10)^3$. Figure 4.8 shows the comparison between the algorithm and
the functions $(x/10)^4$, $(x/10)^3$, $(x/10)^2 \log(x/10)$, $(x/10)^2$.
Figure 4.6: Relation seconds-nodes with semirandom graphs

Figure 4.7: Relation seconds-nodes with semirandom graphs, with less than 2 seconds

Figure 4.8: Comparison between the algorithm and the functions
$(x/10)^4$, $(x/10)^3$, $(x/10)^2 \log(x/10)$, $(x/10)^2$



Summary



In this work we carry out a theoretical study of an algorithm for isomorphism
of graphs of bounded valence proposed by Eugene M. Luks (1982), together with an
implementation of that algorithm in the SAGE system for the case of valence 3.
This work has 4 clearly distinct parts, namely:

1. Preliminaries

2. Previous algorithms

3. Main algorithm

4. Tests of the implementation

Preliminaries

The preliminaries have 3 parts: group theory, graph theory and complexity
theory.

In the first we present the basic definitions of group theory, focusing on
permutation groups; important definitions covered are
orbit, transitivity, G-block and G-block system.

In the second we present the basic definitions of graph theory, such as
what an isomorphism between graphs is; we also present some results, for
instance that the set of automorphisms of a graph forms a group.

Finally, in the third and last part we present general concepts of
complexity, polynomial algorithms and an intuitive idea of reducibility.

Previous algorithms

In this chapter we present two types of algorithms: first we see algorithms
based on group theory and then others within graph theory.

Basic algorithms in group theory

The most important and notable results are the following two lemmas:

Lemma 15 (Furst-Hopcroft-Luks). Given a set of generators for a subgroup
$G$ of $S_n$, one can determine in polynomial time

1. The order of $G$.

2. Whether a permutation $\sigma$ belongs to $G$.

3. Generators of any subgroup of $G$ whose index in $G$ is known to have
a polynomial bound and for which we have a membership test that runs
in polynomial time.

Lemma 16. Given a set of generators for a subgroup $G$ of $S_n$ and a
$G$-orbit $B$, one can determine in polynomial time a minimal $G$-block system
in $B$.

With the first lemma we obtain Algorithm 7 and with the second we obtain
Algorithm 8, which will be important in the main algorithm.



Algorithm 7: Filter

Data: $\sigma \in G$
Result: Add $\sigma$ to its $C_i$
begin
    for $i \in [0, n - 2]$ do
        if $\exists\, \gamma \in C_i : \gamma^{-1}\sigma \in G_{i+1}$ then
            $\sigma \leftarrow \gamma^{-1}\sigma$
        else
            add $\sigma$ to $C_i$
            return
    return
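A small Python sketch of the Filter step, and of how it yields the order of a group, is given below. It rests on assumptions that are not taken from the implementation of this work: permutations are tuples on the points 0, ..., n-1, $G_i$ is taken to be the pointwise stabilizer of 0, ..., i-1, each list `table[i]` plays the role of $C_i$, and the names `sift` and `group_order` are hypothetical.

```python
from itertools import product


def compose(p, q):
    """Return the permutation x -> p[q[x]] (apply q first, then p)."""
    return tuple(p[x] for x in q)


def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for x, y in enumerate(p):
        inv[y] = x
    return tuple(inv)


def sift(sigma, table):
    """Filter sigma through the table; True if a new representative was stored."""
    for i in range(len(sigma) - 1):
        for gamma in table[i]:
            candidate = compose(inverse(gamma), sigma)
            if candidate[i] == i:        # gamma^{-1} sigma fixes 0..i
                sigma = candidate
                break
        else:
            table[i].append(sigma)       # new coset representative in C_i
            return True
    return False


def group_order(generators):
    """Order of the generated group as the product of the sizes of the C_i."""
    n = len(generators[0])
    table = [[tuple(range(n))] for _ in range(n - 1)]
    queue = list(generators)
    while queue:
        if sift(queue.pop(), table):
            reps = [g for row in table for g in row]
            queue.extend(compose(a, b) for a, b in product(reps, repeat=2))
    order = 1
    for row in table:
        order *= len(row)
    return order


order_s3 = group_order([(1, 0, 2), (0, 2, 1)])
```

Whenever a new representative is stored, all pairwise products of representatives are filtered again, so that the table is closed when the queue runs empty.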



Basic algorithms in graph theory

In this part we show an example illustrating that deciding whether two graphs
are isomorphic is not always a hard problem. To this end we present an
$O(n)$ algorithm for the isomorphism of phylogenetic trees, given in
Algorithm 9.

Main algorithm

In this chapter we see the main algorithm. The general idea is shown in
Algorithm 10.

The structure of this chapter is as follows:
La estructura de este capitulo esta dividida como sigue: 
Algorithm 8: Smallest $G$-block which contains $\{1, \omega\}$
Data: $\omega \neq 1$, $G = \langle g_1, \dots, g_m \rangle$
Result: The smallest $G$-block which contains $\{1, \omega\}$
begin
    $C \leftarrow \emptyset$
    Set $f(\alpha) = \alpha$ $\forall \alpha \in A$
    Add $\omega$ to $C$
    Set $f(\omega) = 1$
    while $C$ is nonempty do
        Delete $\beta$ from $C$
        $\alpha \leftarrow f(\beta)$
        $j \leftarrow 0$
        while $j < m$ do
            $j \leftarrow j + 1$
            $\gamma \leftarrow \alpha g_j$
            $\delta \leftarrow \beta g_j$
            if $f(\gamma) \neq f(\delta)$ then
                Ensure $f(\delta) < f(\gamma)$ by interchanging $\gamma$ and $\delta$ if necessary
                for $\varepsilon : f(\varepsilon) = f(\gamma)$ do
                    Set $f(\varepsilon) = f(\delta)$
                Add $f(\gamma)$ to $C$
    return $\{\alpha \in A : f(\alpha) = 1\}$
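The block computation can be sketched in Python as follows. The sketch makes several assumptions: the points are 0, ..., n-1 (so point 0 plays the role of 1), generators are tuples with `g[x]` the image of `x`, the class of each point is relabelled by a linear scan instead of union-find, and the name `smallest_block` is hypothetical.

```python
def smallest_block(n, generators, omega):
    """Smallest block containing the points 0 and omega.

    f[x] is the current representative of the class of x; classes are
    merged until the partition is invariant under every generator.
    """
    f = list(range(n))
    f[omega] = 0
    queue = [omega]
    while queue:
        beta = queue.pop()
        alpha = f[beta]                  # beta and alpha lie in the same class
        for g in generators:
            gamma, delta = f[g[alpha]], f[g[beta]]
            if gamma != delta:
                if delta > gamma:        # keep the smaller representative
                    gamma, delta = delta, gamma
                for x in range(n):
                    if f[x] == gamma:
                        f[x] = delta
                queue.append(gamma)      # the absorbed representative is revisited
    return {x for x in range(n) if f[x] == 0}


# The cyclic group generated by the 4-cycle (0 1 2 3).
block = smallest_block(4, [(1, 2, 3, 0)], 2)
whole = smallest_block(4, [(1, 2, 3, 0)], 1)
```

Running the function for every choice of `omega` and keeping a smallest proper result gives a minimal block system for a transitive group.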
Algorithm 9: PhylogeneticTreeIsomorphism

Data: $T_1 = (T_1, \rho_1, \{u_1, u_2, \dots\})$ and $T_2 = (T_2, \rho_2, \{u_1, \dots\})$
Result: Test if $T_1$ and $T_2$ are isomorphic
begin
    Set $\varphi(\rho_1(u_i)) = \rho_2(u_i)$ $\forall i$
    Nodes $\leftarrow$ PostOrderIterator($T_1$)
    $w \leftarrow$ Nodes.next()
    while Nodes.hasNext() do
        if $w$ is not a leaf then
            if $\varphi(w)$ == none then
                $v \leftarrow$ child of $w$
                Set $\varphi(w) = \mathrm{parent}(\varphi(v))$
            for $v$ child of $w$ do
                if $\varphi(w) \neq \mathrm{parent}(\varphi(v))$ then
                    return False
    return $\varphi$



Algorithm 10: Isomorphism of graphs of bounded valence
Data: $X_1$, $X_2$ connected graphs of bounded valence
Result: Test if $X_1$ and $X_2$ are isomorphic
begin
    $e_1 \in E(X_1)$
    for $e_2 \in E(X_2)$ do
        $X \leftarrow$ BuildX($X_1, X_2, e_1, e_2$)
        $G \leftarrow$ Aut($X, e$)
        for $\sigma \in G$ do
            if $\sigma(v_1) == v_2$ then
                return True
    return False
• Valence 3.

• Complexity study for the case of valence 3.

• Improvements for the implementation.

• Generalization to the general case.



Valence 3

In this part we show how the algorithm works when the graphs have
valence 3. To do so we compute the automorphism group of a graph; to this
end we compute a sequence of graphs and create a series of homomorphisms
between the automorphism groups of that sequence of graphs. Here we use
Algorithm 11 and obtain the desired sequence of automorphism groups.



Algorithm 11: The group $\mathrm{Aut}_e$

Data: A sequence of graphs $Y$, which are the result of BuildX
Result: $\mathrm{Aut}_e(X)$ where $X$ is the last graph in the sequence
begin
    $\mathrm{Aut}_e = \langle (e_1\ e_2) \rangle$
    for $X \in Y$ do
        $K \leftarrow$ Ker($X$)
        $S \leftarrow$ Image($\mathrm{Aut}_e, X$)
        $S_2 \leftarrow$ Pullback($S, X$)
        $\mathrm{Aut}_e = S_2 \cup K$
    return $\mathrm{Aut}_e$



Complexity study

In this part we show in more detail that the previous algorithm is
polynomial and that $O(n^{10})$ is an upper bound for the cost of that algorithm.

Improvements for the implementation

We devote this part to the study of improvements with a view to the
implementation. These improvements are:

• Reducing the size of $A_r$.

• Representing the groups by means of SGS.

• Precomputing the blocks.

• Other improvements.

With these improvements the algorithm becomes $O(n^4)$ in the worst
case.

General case

Finally we see that for the general case all we need is to check
that the kernel of the homomorphisms is still a 2-group, and therefore
we can apply everything else, adapting it to each valence.

Tests of the implementation

Finally we present some tests carried out with the implementation in the
SAGE system. With them we show that the upper bound of $O(n^4)$ is not reached and
that in the average case the algorithm has a cost, informally, between $O(n^3)$ and
$O(n^2 \log n)$.

In the appendix we give the documentation of the implementation, although the
reader is advised to visit the page
http://www.alumnos.unican.es/aam35/sage-epydoc/index.html, where there is detailed
documentation in HTML that is much easier and quicker to use.